Preface



ASIACRYPT 2000 was the sixth annual ASIACRYPT conference. It was sponsored
by the International Association for Cryptologic Research (IACR) in cooperation
with the Institute of Electronics, Information, and Communication Engineers
(IEICE).

The first conference with the name ASIACRYPT took place in 1991, and the
subsequent ASIACRYPT conferences were held in 1994, 1996, 1998, and 1999, in
cooperation with IACR. ASIACRYPT 2000 was the first conference in the series
to be sponsored by IACR.

The conference received 140 submissions (1 submission was withdrawn by 
the authors later), and the program committee selected 45 of these for presenta- 
tion. Extended abstracts of the revised versions of these papers are included in 
these proceedings. The program also included two invited lectures by Thomas
Berson (Cryptography Everywhere: IACR Distinguished Lecture) and Hideki
Imai (CRYPTREC Project - Cryptographic Evaluation Project for the Japanese
Electronic Government). Abstracts of these talks are included in these
proceedings.

The conference program also included its traditional “rump session” of short, 
informal or impromptu presentations, kindly chaired by Moti Yung. Those pre- 
sentations are not reflected in these proceedings. 

The selection of the program was a challenging task as many high quality 
submissions were received. The program committee worked very hard to evaluate 
the papers with respect to quality, originality, and relevance to cryptography. 

I am extremely grateful to the program committee members for their enor- 
mous investment of time and effort in the difficult and delicate process of review 
and selection. 

I gratefully acknowledge the help of a large number of colleagues who
reviewed submissions in their area of expertise: Masayuki Abe, Harald Baier,
Olivier Baudron, Mihir Bellare, John Black, Michelle Boivin, Seong-Taek Chee, 
Ronald Cramer, Claude Crepeau, Pierre-Alain Fouque, Louis Granboulan, Sa- 
fuat Hamdy, Goichiro Hanaoka, Birgit Henhapl, Mike Jacobson, Masayuki Kanda, 
Jonathan Katz, Dennis Kuegler, Dong-Hoon Lee, Markus Maurer, Bodo Moeller, 
Phong Nguyen, Satoshi Obana, Thomas Pfahler, John O. Pliam, David Pointch, 
Guillaume Poupard, Junji Shikata, Holger Vogt, Ullrich Vollmer, Yuji Watanabe, 
Annegret Weng, and Seiji Yoshimoto. 

An electronic submission process was available and recommended. I would 
like to thank Kazumaro Aoki, who did an excellent job in running the electronic 
submission system of the ACM SIGACT group and in making a support system 
for the review process of the PC members. Special thanks go to the many people
who supported him: Seiichiro Hangai and Christian Cachin for their web page
support, Joe Kilian for giving him a MIME parser, Steve Tate for supporting the
SIGACT package, Wim Moreau for consulting on their electronic review system,
and Masayuki Abe for scanning non-electronic submissions. Special thanks go
to Mami Yamaguchi and Junko Taneda for their support in arranging review
reports and editing these proceedings.

I would like to thank Tsutomu Matsumoto, general chair, and the members of
the organizing committee: Seiichiro Hangai, Shouichi Hirose, Daisuke Inoue, Keiichi
Iwamura, Masayuki Kanda, Toshinobu Kaneko, Shinichi Kawamura, Michiharu 
Kudo, Hidenori Kuwakado, Masahiro Mambo, Mitsuru Matsui, Natsume Mat- 
suzaki, Atsuko Miyaji, Shiho Moriai, Eiji Okamoto, Kouichi Sakurai, Fumihiko 
Sano, Atsushi Shimbo, Takeshi Shimoyama, Hiroki Shizuya, Nobuhiro Tagashira, 
Kazuo Takaragi, Makoto Tatebayashi, Toshio Tokita, Naoya Torii. We are es- 
pecially grateful to Shigeo Tsujii and Hideki Imai for their great support of the 
organizing committee. 

The organizing committee gratefully acknowledges the financial contributions
of two organizations, Initiatives in Research of Information Security (IRIS)
and the Telecommunications Advancement Organization (TAF), as well as many
companies.

I wish to thank all the authors who by submitting papers made this confer- 
ence possible, and the authors of accepted papers for their cooperation. 

Finally, I would like to dedicate these proceedings to the memory of Kenji 
Koyama, who passed away in March 2000. He was 50 years old. He was one 
of the main organizers of the first ASIACRYPT conference held in Japan in 
1991, and devoted himself to making IACR the sponsor of ASIACRYPT. He was
looking forward to ASIACRYPT 2000 very much, since it was the first of the
ASIACRYPT conference series sponsored by IACR. May he rest in peace.

Tatsuaki Okamoto

Cryptanalytic Time/Memory/Data Tradeoffs for Stream Ciphers



Alex Biryukov and Adi Shamir 



Computer Science Department 
The Weizmann Institute 
Rehovot 76100, Israel. 



Abstract. In 1980 Hellman introduced a general technique for breaking
arbitrary block ciphers with N possible keys in time T and memory M
related by the tradeoff curve $TM^2 = N^2$ for $1 \le T \le N$. Recently,
Babbage and Golic pointed out that a different TM = N tradeoff attack
for $1 \le T \le D$ is applicable to stream ciphers, where D is the amount
of output data available to the attacker. In this paper we show that a
combination of the two approaches has an improved time/memory/data
tradeoff for stream ciphers of the form $TM^2D^2 = N^2$ for any
$D^2 \le T \le N$. In addition, we show that stream ciphers with low sampling
resistance have tradeoff attacks with fewer table lookups and a wider
choice of parameters.

Keywords: Cryptanalysis, stream ciphers, time/memory tradeoff at- 
tacks. 



1 Introduction 

There are two major types of symmetric cryptosystems: Block ciphers (which 
encrypt a plaintext block into a ciphertext block by mixing it in an invertible 
way with a fixed key), and stream ciphers (which use a finite state machine 
initialized with the key to produce a long pseudo random bit string, which is 
XOR’ed with the plaintext to obtain the ciphertext). 

Block and stream ciphers have different design principles, different attacks, 
and different measures of security. The open cryptanalytic literature contains 
many papers on the resistance of block ciphers to differential and linear attacks, 
on their avalanche properties, on the properties of Feistel or S-P structures, 
on the design of S-boxes and key schedules, etc. The relatively few papers on 
stream ciphers tend to concentrate on particular ciphers and on particular at- 
tacks against them. Among the few unifying ideas in this area are the use of linear 
feedback shift registers as bit generators, and the study of the linear complexity 
and correlation immunity of the ciphers. 

In this paper we concentrate on a general type of cryptanalytic attack known 
as a time/memory tradeoff attack. Such an attack has two phases: During the 
preprocessing phase (which can take a very long time) the attacker explores the 
general structure of the cryptosystem, and summarizes his findings in large tables 
(which are not tied to particular keys). During the realtime phase, the attacker
is given actual data produced from a particular unknown key, and his goal is to
use the precomputed tables in order to find the key as quickly as possible.

T. Okamoto (Ed.): ASIACRYPT 2000, LNCS 1976, pp. 1-13, 2000.

In any time-memory tradeoff attack there are five key parameters: 

- N represents the size of the search space.
- P represents the time required by the preprocessing phase of the attack.
- M represents the amount of random access memory (in the form of hard
  disks or DVDs) available to the attacker.
- T represents the time required by the realtime phase of the attack.
- D represents the amount of realtime data available to the attacker.

2 Tradeoff Attacks on Block and Stream Ciphers 

In the case of block ciphers, the size N of the search space is the number of 
possible keys. We assume that the number of possible plaintexts and ciphertexts 
is also N , and that the given data is a single ciphertext block produced from a 
fixed chosen plaintext block. The best known time/memory tradeoff attack is due
to Hellman [5]. It uses any combination of parameters which satisfy the following
relationships: $TM^2 = N^2$, $P = N$, $D = 1$ (see Section 3 for further details). The
optimal choice of T and M depends on the relative cost of these computational
resources. By choosing T = M, Hellman gets the particular tradeoff point
$T = N^{2/3}$ and $M = N^{2/3}$.

Hellman’s attack is applicable to any block cipher whose key to ciphertext
mapping (for a fixed plaintext) behaves as a random function f over a space of
N points. If this function happens to be an invertible permutation, the tradeoff
relation becomes TM = N, which is even better. An interesting property of
Hellman’s attack is that even if the attacker is given a large number D of chosen
plaintext/ciphertext pairs, it is not clear how to use them in order to improve
the attack.

Stream ciphers have a very different behavior with respect to time/memory 
tradeoff attacks. The size N of the search space is determined by the number 
of internal states of the bit generator, which can be different from the number 
of keys. The realtime data typically consists of the first D pseudorandom bits 
produced by the generator, which are computed by XOR’ing a known plaintext 
header and the corresponding ciphertext bits (there is no difference between a 
known and a chosen plaintext attack in this case). The goal of the attacker is to 
find at least one of the actual states of the generator during the generation of 
this output, after which he can run the generator forwards an unlimited number 
of steps, produce all the later pseudorandom bits, and derive the rest of the 
plaintext. Note that in this case there is no need to run the generator backwards 
or to find the original key, even though this is doable in many practical cases. 

The simplest time/memory tradeoff attack on stream ciphers was independently
described by Babbage [2] and Golic [4], and will be referred to as the BG
attack. It associates with each one of the N possible states of the generator the
string consisting of the first log(N) bits produced by the generator from that
state. This mapping f(x) = y from states x to output prefixes y can be viewed as
a random function over a common space of N points, which is easy to evaluate
but hard to invert. The goal of the attacker is to invert it on some substring
of the given output, in order to recover the corresponding internal state. The
preprocessing phase of the attack picks M random states $x_i$, computes their
corresponding output prefixes $y_i$, and stores all the $(x_i, y_i)$ pairs in a random
access memory, sorted into increasing order of $y_i$. The realtime phase of the
attack is given a prefix of D + log(N) - 1 generated bits, and derives from it all
the D possible windows $y_1, y_2, \ldots, y_D$ of log(N) consecutive bits (with overlaps).
It looks up each $y_j$ from the data in logarithmic time in the sorted table. If at
least one $y_j$ is found in the table, its corresponding $x_j$ makes it possible to
derive the rest of the plaintext by running the generator forwards from this known
state^1. The threshold of success for this attack can be derived from the
birthday paradox, which states that two random subsets of a space with N points
are likely to intersect when the product of their sizes exceeds N. If we ignore
logarithmic factors, this condition becomes DM = N where the preprocessing
time is P = M and the attack time is T = D. This represents one particular
point on the time/memory tradeoff curve TM = N. By ignoring some of the
available data during the actual attack, we can reduce T from D towards 1, and
thus generalize the tradeoff to TM = N and P = M for any $1 \le T \le D$.
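
The two phases of the BG attack can be sketched in a few lines. The 16-bit generator below (`step`) is a hypothetical toy invented for this illustration, not a cipher from the paper, and the names `prefix`, `preprocess`, `pick_states`, and `bg_attack` are likewise assumptions. The sketch stores (prefix, state) pairs in a sorted list, slides a window over the known output, and rejects false alarms by regenerating the remaining output.

```python
import random
from bisect import bisect_left

N_BITS = 16              # toy state size, so N = 2**16 states
N = 1 << N_BITS

def step(state):
    """Clock the toy generator once; return (next_state, output_bit)."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0xB400  # Galois LFSR feedback mask (toy choice)
    out = ((state & (state >> 3)) ^ (state >> 7) ^ (state >> 12)) & 1
    return state, out

def keystream(state, nbits):
    """Return the first nbits output bits produced from the given state."""
    bits = []
    for _ in range(nbits):
        state, bit = step(state)
        bits.append(bit)
    return bits

def prefix(state):
    """f(x): the first log2(N) output bits from state x, packed into an int."""
    y = 0
    for bit in keystream(state, N_BITS):
        y = (y << 1) | bit
    return y

def pick_states(M, seed=0):
    """Choose M pseudo-random nonzero states for the preprocessing phase."""
    rng = random.Random(seed)
    return [rng.randrange(1, N) for _ in range(M)]

def preprocess(states):
    """Store the (y_i, x_i) pairs sorted into increasing order of y_i."""
    return sorted((prefix(x), x) for x in states)

def bg_attack(table, stream, D):
    """Look up the D overlapping windows; return (offset, state) or None."""
    for j in range(D):
        y = 0
        for bit in stream[j:j + N_BITS]:
            y = (y << 1) | bit
        i = bisect_left(table, (y, 0))       # logarithmic-time lookup
        while i < len(table) and table[i][0] == y:
            x = table[i][1]
            # Reject false alarms by regenerating the rest of the stream.
            if keystream(x, len(stream) - j) == stream[j:]:
                return j, x
            i += 1
    return None
```

With M stored states and D windows the attack succeeds with reasonable probability only once the product DM approaches N; for smaller products `bg_attack` will usually return `None`.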

This TM = N tradeoff is similar to Hellman’s TM = N tradeoff for random
permutations and better than Hellman’s $TM^2 = N^2$ tradeoff for random functions
(when T = M we get $T = M = N^{1/2}$ instead of $T = M = N^{2/3}$). However,
this formal comparison is misleading since the two tradeoffs are completely
different: they are applicable to different types of cryptosystems (stream vs. block
ciphers), are valid in different parameter ranges ($1 \le T \le D$ vs. $1 \le T \le N$),
and require different amounts of data (about D bits vs. a single chosen
plaintext/ciphertext pair).

To understand the fundamental difference between tradeoff attacks on block 
ciphers and on stream ciphers, consider the problem of using a large value of 
D to speed up the attack. The mapping defined by a block cipher has two 
inputs (key and plaintext block) and one output (ciphertext block). Since each 
precomputed table in Hellman’s attack on block ciphers is associated with a
particular plaintext block, we cannot use a common table to simultaneously
analyse different ciphertext blocks (which are necessarily derived from different
plaintext blocks during the lifetime of a single key). The mapping defined by a
stream cipher, on the other hand, has one input (state) and one output (an output
prefix), and thus has a single “flavour”: When we try to invert it on multiple 
output prefixes, we can use the same precomputed tables in all the attempts. 
As a result, tradeoff attacks on stream ciphers can be much more efficient than 
tradeoff attacks on block ciphers when D is large, but this possibility had not 
been explored so far in the research literature. 



^1 Note that $y_j$ may have multiple predecessors, and thus $x_j$ may be different from the
state we look for. However, it can be shown that these “false alarms” increase the
complexity of the attack by only a small constant factor.

3 Combining the Two Tradeoff Attacks

In this section we show that it is possible to combine the two types of tradeoff 
attacks to obtain a new attack on stream ciphers whose parameters satisfy the 
relation P = N/D and $TM^2D^2 = N^2$ for any $D^2 \le T \le N$. A typical point
on this tradeoff relation is $P = N^{2/3}$ preprocessing time, $T = N^{2/3}$ attack
time, $M = N^{1/3}$ disk space, and $D = N^{1/3}$ available data. For $N = 2^{100}$ the
parameters $P = T = 2^{66}$ and $M = D = 2^{33}$ are all (barely) feasible, whereas the
Hellman attack with $T = M = N^{2/3} = 2^{66}$ requires an unrealistic amount of disk
space M, and the BG attack with $T = D = N^{2/3} = 2^{66}$ and $M = N^{1/3} = 2^{33}$
requires an unrealistic amount of data D.



3.1 Hellman’s Time/Memory Tradeoff Attack on Block Ciphers

The starting point of the new attack on stream ciphers is Hellman’s original
tradeoff attack on block ciphers, which considers the random function f that
maps the key x to the ciphertext block y for some fixed chosen plaintext. This f is
easy to evaluate but hard to invert, since the problem of computing $x = f^{-1}(y)$ is
exactly the cryptanalytic problem of deriving the key x from the given ciphertext
block y.

To perform this difficult inversion of f with an algorithm which is faster than
exhaustive search, Hellman uses a preprocessing stage which tries to cover the N
points of the space with a rectangular m × t matrix whose rows are long paths
obtained by iterating the function f t times on m randomly chosen starting points.
The startpoints are described by the leftmost column of the matrix, and the
corresponding endpoints are described by the rightmost column of the matrix
(see Fig. 1). The output of the preprocessing stage is the collection of
(startpoint, endpoint) pairs of all the chosen paths, sorted into increasing endpoint
values. During the actual attack, we are given a value y and are asked to find its
predecessor x under f. If this x is covered by one of the precomputed paths, the
algorithm repeatedly applies f to y until it reaches the stored endpoint, jumps
to its associated startpoint, and repeatedly applies f to the startpoint until it
reaches y again. The previous point it visits is the desired x.
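
The preprocessing and lookup steps for a single matrix can be sketched as follows. This is a minimal illustration under stated assumptions: the random function is a stand-in built from SHA-256 (`make_f` is a hypothetical helper, not a cipher), the names `build_table` and `invert` are invented here, and a dictionary keyed by endpoint replaces the sorted list of (startpoint, endpoint) pairs described in the text.

```python
import hashlib

def make_f(n_points, tag=b"toy"):
    """Return a stand-in random function over range(n_points)."""
    def f(x):
        digest = hashlib.sha256(tag + x.to_bytes(8, "big")).digest()
        return int.from_bytes(digest[:8], "big") % n_points
    return f

def build_table(f, startpoints, t):
    """Iterate f t times from each startpoint; keep only (endpoint -> startpoints)."""
    table = {}
    for sp in startpoints:
        x = sp
        for _ in range(t):
            x = f(x)
        table.setdefault(x, []).append(sp)   # paths may merge on one endpoint
    return table

def invert(f, table, y, t):
    """Return some x with f(x) == y if a stored path covers it, else None."""
    z = y
    for _ in range(t):
        # z is y advanced some number of steps; test whether it is an endpoint.
        for sp in table.get(z, []):
            x = sp
            for _ in range(t):
                nxt = f(x)
                if nxt == y:
                    return x                 # the point visited just before y
                x = nxt
            # Walk never reached y: a false alarm, try the next candidate.
        z = f(z)
    return None
```

A returned `x` is a valid predecessor of `y`, but, as with any random function, it need not be the only one.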

A single matrix cannot efficiently cover all the N points (in particular, the
only way we can cover the approximately N/e leaves of a random directed graph
is to choose them as starting points). As we add more rows to the matrix,
we reach a situation in which we start to re-cover points which are already
covered, which makes the coverage increasingly wasteful. To find this critical
value of m, assume that the first m paths are all disjoint, but the next path
has a common point with one of the previous paths. The first m paths contain
exactly mt distinct points (since they are assumed to have no repetitions), and
the additional path is likely to contain exactly t distinct points (assuming that t
is less than $\sqrt{N}$). By the birthday paradox, the two sets are likely to be disjoint
as long as $t \cdot mt \le N$, and thus we choose m and t which satisfy the relation
$mt^2 = N$, which we call the matrix stopping rule.
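
The effect of the stopping rule can be observed empirically with a small simulation. The sketch below is an illustration only: it assumes an explicitly tabulated random function and a hypothetical helper `coverage`, and it counts how many distinct points m paths of length t actually cover.

```python
import random

def coverage(n_points, m, t, seed=0):
    """Count distinct points covered by m paths of t points each."""
    rng = random.Random(seed)
    # Explicit table of a random function f: range(n_points) -> range(n_points).
    f = [rng.randrange(n_points) for _ in range(n_points)]
    covered = set()
    for _row in range(m):
        x = rng.randrange(n_points)      # random startpoint
        for _col in range(t):
            covered.add(x)
            x = f[x]
    return len(covered)
```

Calling `coverage(N, m, t)` with $mt^2 = N$ and then again with several times as many rows gives a feel for how the fraction of newly covered points per row drops once m exceeds the stopping-rule value; the exact counts depend on the seed.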




[Figure: a rectangular matrix with m rows, each a path of length t running from
a startpoint (left column) to an endpoint (right column).]

Fig. 1. Hellman’s Matrix



A single m × t matrix with $mt^2 = N$ covers only a fraction of mt/N =
1/t of the space, and thus we need t “unrelated” matrices to cover the whole
space. Hellman’s great insight was the observation that we can use variants $f_i$
of the original f defined by $f_i(x) = h_i(f(x))$ where $h_i$ is some simple output
modification (e.g., reordering the bits of f(x)). These modified variants of f
have the following properties:

1. The points in the matrices of $f_i$ and $f_j$ for $i \ne j$ are essentially independent,
since the existence of a common point in two different matrices does not
imply that subsequent points on the two paths must also be equal.
Consequently, the union of t matrices (each covering mt points) is likely to contain
a fixed fraction of the space.

2. The problem of computing x from the given y = f(x) can be solved by
inverting any one of the modified functions $f_i$ over the modified point
$y_i = f_i(x) = h_i(f(x))$.

3. The value of $y_i = f_i(x)$ can be computed even when we do not know x by
applying $h_i$ to the given y = f(x).

The total precomputation requires $P \approx N$ time, since we have to cover a
fixed fraction of the space in all the precomputed paths. Each matrix covers
mt points, but can be stored in m memory locations since we only keep the
startpoint and endpoint of each path. The total memory required to store the
t matrices is thus M = mt. The given y is likely to be covered by only one of
the precomputed matrices, but since we do not know where it is located we have
to perform t inversion attempts, each requiring t evaluations of some $f_i$. The
total time complexity of the actual attack is thus $T = t^2$. To find the tradeoff
curve between T and M, we use the matrix stopping rule $mt^2 = N$ to conclude
that $TM^2 = t^2 \cdot m^2t^2 = N^2$. Note that in this tradeoff formula the time T can
be anywhere in the range $1 \le T \le N$, but the space M should be restricted
to $N^{1/2} \le M \le N$, since otherwise T > N and thus the attack is slower than
exhaustive search.



3.2 An Improved Attack on Stream Ciphers 

As explained earlier in this paper, the main difference between tradeoff attacks 
on block ciphers and on stream ciphers is that in a block cipher each given 
ciphertext requires the inversion of a different function, whereas in a stream 
cipher all the given output prefixes can be inverted with respect to the same 
function by using the same precomputed tables. 

To adapt Hellman’s attack from block ciphers to stream ciphers, we use the
same basic approach of covering the N points by matrices defined by multiple
variants $f_i$ of the function f which represents the state to prefix mapping.
Note that partially overlapping prefixes do not necessarily represent neighboring
points in the graph defined by the iterations of f, and thus they can be viewed
as unrelated random points in the graph. The attack is successful if any one
of the D given output values is found in any one of the matrices, since we can 
then find some actual state of the generator which can be run forward beyond 
the known prefix of output bits. We can thus reduce the total number of points 
covered by all the matrices from about N to N/D points, and still get (with 
high probability) a collision between the stored and actual states. 

There are two possible ways to reduce the number of states covered by the 
matrices: By making each matrix smaller, or by choosing fewer matrices. Since 
each evaluation step of $f_i$ adds m states to the coverage, it is wasteful to choose m
or t which are smaller than the maximum values allowed by the matrix stopping
rule $mt^2 = N$. Our new tradeoff thus keeps each matrix as large as possible,
and reduces the number of matrices from t to t/D in order to decrease the total
coverage of all the matrices by a factor of D. However, this is possible only when
$t \ge D$, since if we try to reduce the number of tables to less than 1, we are forced
to use suboptimal values of m and t, and thus enter a less efficient region of the
tradeoff curve.

Each matrix in the new attack requires the same storage size m as before, 
but the total memory required to store all the matrices is reduced from M = mt 
to M = mt/D. The total preprocessing time is similarly reduced from P = N to 
P = N/D, since we have to evaluate only 1/D of the previous number of paths. 
The attack time T is the product of the number of matrices, the length of each
path, and the number of available data points, since we have to iterate each one
of the t/D functions $f_i$ on each one of the D given output prefixes up to t times.
This product is $T = t^2$, which is the same as in Hellman’s original attack.

To find the time/memory/data tradeoff in this attack, we again use the matrix
stopping rule $mt^2 = N$ in order to eliminate the parameters m and t from
the various expressions. The preprocessing time is P = N/D, which is already
free from these parameters. The time $T = t^2$, memory M = mt/D, and data D
clearly satisfy the invariant relationship:

$$TM^2D^2 = t^2 \cdot (m^2t^2/D^2) \cdot D^2 = m^2t^4 = N^2$$

This relationship is valid for any $t \ge D$, and thus for any $D^2 \le T \le N$. In
particular, we can use the parameters $P = T = N^{2/3}$, $M = D = N^{1/3}$, which
seems to be practical for N up to about $2^{100}$.
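
The parameter bookkeeping of the combined attack is easy to check mechanically. The sketch below works with base-2 logarithms as exact fractions; the function names `combined_point` and `satisfies_tradeoff` and the choice of writing $N = 2^n$, $D = 2^d$, $t = 2^a$ are assumptions made for this illustration.

```python
from fractions import Fraction

def combined_point(n, d, a):
    """Return log2 of (P, T, M) for N = 2**n, D = 2**d and path length t = 2**a."""
    n, d, a = Fraction(n), Fraction(d), Fraction(a)
    if not (0 <= d <= a <= n / 2):
        raise ValueError("need 1 <= D <= t and m = N / t**2 >= 1")
    log_m = n - 2 * a          # matrix stopping rule: m * t**2 = N
    log_p = n - d              # P = N / D
    log_t = 2 * a              # T = t**2
    log_mem = log_m + a - d    # M = m * t / D
    return log_p, log_t, log_mem

def satisfies_tradeoff(n, d, log_t, log_mem):
    """Check T * M**2 * D**2 == N**2 by comparing exponents."""
    return log_t + 2 * log_mem + 2 * Fraction(d) == 2 * Fraction(n)
```

For example, `combined_point(100, Fraction(100, 3), Fraction(100, 3))` corresponds to the typical point $P = T = N^{2/3}$, $M = D = N^{1/3}$ for $N = 2^{100}$.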

4 Time/Memory/Data Tradeoff Attacks with Sampling 

One practical problem with tradeoff attacks is that random access to a hard 
disk requires about 8 milliseconds, whereas a computational step on a fast PC 
requires less than 2 nanoseconds. This speed ratio of four million makes it crucial 
to minimize the number of disk operations we perform, in addition to reducing 
the number of evaluations of $f_i$. An old idea due to Ron Rivest was to reduce
the number of table lookups in Hellman’s attack by defining a subset of special
points whose names start with a fixed pattern such as k zero bits.

Special points are easy to generate and to recognize. During the preprocessing 
stage of Hellman’s attack, we start each path from a randomly chosen point, and
stop it only when we encounter another special point (or enter a loop, which is
unlikely when $t < \sqrt{N}$). Consequently, we know that the disk contains only
special endpoints. If we choose k = log(t), the expected length of each path
remains t (with some variability), and the set of mt endpoints we store in all the
t tables contains a large fraction of the N/t possible special points.

The main advantage of this approach is that during the actual attack, we 
have to perform only one expensive disk operation per path (when we encounter 
the first special point on it). The number of evaluations of $f_i$ remains $T = t^2$,
but the number of disk operations is reduced from $t^2$ to t, which makes a huge
practical difference.
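
The saving in disk operations can be made concrete with a sketch of one table built with special points. Everything named here is an assumption for illustration: the SHA-256 based function `f` stands in for the cipher mapping, `K` zero bits mark a special point, and a dictionary lookup plays the role of the expensive disk access.

```python
import hashlib

N_BITS = 20      # toy space of N = 2**20 points
K = 5            # special points start with K zero bits

def f(x):
    """Stand-in random function over N_BITS-bit values (an assumption)."""
    digest = hashlib.sha256(x.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - N_BITS)

def is_special(x):
    """True if the K most significant bits of x are zero."""
    return x >> (N_BITS - K) == 0

def build_table(startpoints, max_len):
    """Map each special endpoint to the startpoints whose paths end there."""
    table = {}
    for sp in startpoints:
        x = sp
        for _ in range(max_len):
            x = f(x)
            if is_special(x):
                table.setdefault(x, []).append(sp)
                break
        # Paths that meet no special point within max_len steps are dropped.
    return table

def invert(table, y, max_len):
    """Return (x, disk_lookups) with f(x) == y, or (None, disk_lookups)."""
    z = y
    for _ in range(max_len):
        if is_special(z):
            # The only table access: the first special point reached from y.
            for sp in table.get(z, []):
                x = sp
                for _ in range(max_len):
                    nxt = f(x)
                    if nxt == y:
                        return x, 1
                    if is_special(nxt):
                        break        # end of this stored path, y not on it
                    x = nxt
            return None, 1
        z = f(z)
    return None, 0
```

Each inversion attempt touches the table at most once, regardless of how many evaluations of `f` are needed to reach the first special point.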

Can we use a similar sampling of special points in tradeoff attacks on stream 
ciphers? Consider first the case of the BG tradeoff with TM = N, P = M, 
and 1 < T < D. We say that an output prefix is special if it starts with a 
certain number of zero bits, and that a state of the stream cipher is special if 
it generates a special output prefix. We would like to store in the disk during 
preprocessing only special pairs of (state, output prefix). Unlike the case of 
Hellman’s attack (where special states appeared on sufficiently long paths with
reasonable probability, and acted as natural path terminators), in the BG attack 
we deal with degenerate paths of length 1 (from a state to its immediate output 
prefix), and thus we have to use trial and error in order to find special states. 

Assume that the ratio between the number of special states and all the states 
is R, where 0 < R < 1. Then to find the M special states we would like to store 
during preprocessing, we have to try a much larger number M/R of random 
states, which increases the preprocessing time from P = M to P = M/R. The 
attack time reduces from T = D to T = DR, since only the special points in the 
given data (which are very easy to spot) have to be looked up in the disk. To 
make it likely to have a collision between the M special states stored in the disk
and the DR special states in the data, we have to apply the birthday paradox
to the smaller set of NR special states to obtain MDR = NR. The invariant
satisfied for all the possible values of R is thus

$$TP = MD = N \quad \text{for } 1 \le T \le D$$
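
For readers who want the intermediate step spelled out, the invariant follows directly from the three quantities stated above (P = M/R, T = DR, and the birthday condition MDR = NR); the short derivation below only restates them and introduces no new symbols.

```latex
\begin{align*}
MDR = NR \;&\Longrightarrow\; MD = N, \\
TP = (DR)\cdot\frac{M}{R} = DM \;&\Longrightarrow\; TP = N .
\end{align*}
```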

An interesting consequence of this tradeoff formula is that the sampling technique
has turned the original BG time/memory tradeoff (TM = N) into two
independent time/preprocessing (TP = N) and memory/data (MD = N) tradeoffs,
which are controlled by the three parameters m, t, and R. For $N = 2^{100}$
the first condition is easy to satisfy, since both the preprocessing time P and the
actual time T can be chosen as $2^{50}$. However, the second condition is completely
unrealistic, since neither the memory M nor the data D can exceed $2^{40}$.

We now describe the effect of this sampling technique on the new $TM^2D^2 = N^2$
tradeoff described in the previous subsection. The main difference between
Hellman’s original attack on block ciphers and the modified attack on stream
ciphers is that we use a smaller number t/D of tables, and force T to satisfy
$T \ge D^2$. Unlike the case of the BG attack, the preprocessing complexity remains
unchanged as N/D, since we do not need any trial and error to pick the random
startpoints, and simply wait for the special endpoints to occur randomly during
our path evaluation. The total memory required to store the special points
remains unchanged at M = mt/D. The total time T consists of $t^2$ evaluations
of the $f_i$ functions but only t disk operations. We can thus conclude that the
resultant time/memory/data tradeoff remains unchanged as $TM^2D^2 = N^2$ for
$T \ge D^2$, but we gain by reducing the number of expensive disk operations by a
factor of t. Rivest’s sampling idea thus has no asymptotic effect on Hellman-like
tradeoff curves for block and stream ciphers, but drastically changes the BG
tradeoff curve for stream ciphers.



5 Tradeoff Attacks on Stream Ciphers with Low 
Sampling Resistance 

The $TM^2D^2 = N^2$ tradeoff attack has feasible time, memory and data requirements
even for $N = 2^{100}$. However, values of $D \ge 2^{25}$ make each inversion attack
very time consuming, since small values of T are not allowed by the $T \ge D^2$
condition, while large values of T do not benefit in practice from the Rivest
sampling idea (since the $T = t^2$ evaluations of $f_i$ functions dominate the
$\sqrt{T}$ disk operations).

At FSE 2000, Biryukov, Shamir and Wagner [3] introduced a different notion 
of sampling, which will be called BSW sampling. It was used in [3] to attack the 
specific stream cipher A5/1, but that paper did not analyse its general impact 
on the various tradeoff formulas. In this paper we show that by using BSW
sampling, we can make the new $TM^2D^2 = N^2$ tradeoff applicable with a larger
choice of possible T values and a smaller number of disk operations.

The basic idea behind BSW sampling is that in many stream ciphers, the 
state undergoes only a limited number of simple transformations before emitting 
its next output bit, and thus it is possible to enumerate all the special states 
which generate k zero bits for a small value of k without expensive trial and 
error (especially when each output bit is determined by few state bits). This is
almost always possible for k = 1, but gets increasingly more difficult when we
try to force a larger number of output bits to have specific values. The sampling
resistance of a stream cipher is defined as $R = 2^{-k}$ where k is the maximum
value for which this direct enumeration is possible. Stream ciphers were never
designed to resist this new kind of sampling, and their sampling resistance can
serve as a new quantifiable design-sensitive security measure. In the case of A5/1,
Biryukov, Shamir and Wagner show that it is easy to directly enumerate the $2^{48}$
out of the $2^{64}$ states whose outputs start with 16 zeroes, and thus the sampling
resistance of A5/1 is at most $2^{-16}$. Note that BSW sampling is not applicable
at all to block ciphers, since their thorough mixing of keys and plaintexts makes 
it very difficult to enumerate without trial and error all the keys which lead to 
ciphertexts with a particular pattern of k bits during the encryption of some 
fixed plaintext. 

An obvious advantage of BSW sampling over Rivest sampling is that in the 
BG attack we can reduce the attack time T by a factor of R without increasing 
the preprocessing time P. We now describe how to apply the BSW sampling
idea to the improved tradeoff attack $TM^2D^2 = N^2$.

Consider a stream cipher with $N = 2^n$ states. Each state has a full name
of n bits, and an output name which consists of the first n bits in its output
sequence. If the cipher has sampling resistance $R = 2^{-k}$, we can associate with
each special state a short name of n - k bits (which is used by the efficient
enumeration procedure to define this special state), and a short output of
n - k bits (which is the output name of the special state without the k leading
zeroes). We can thus define a new random mapping over a reduced space of
$NR = 2^{n-k}$ points, where each point can be viewed as either a short name
or a short output. The mapping from short names to short outputs is easy to 
evaluate (by expanding the short names of special states to full names, running 
the generator, and discarding the k leading zeroes), and its inversion is equivalent 
to the original cryptanalytic problem restricted to special states. 
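
The reduced mapping from short names to short outputs can be illustrated on a toy generator. The cipher below is hypothetical and was chosen only because its special states are easy to enumerate (output bit j is state bit j XORed with a product of two later bits); it is not A5/1, and the names `run`, `expand`, and `short_output` are assumptions of this sketch.

```python
N_BITS, K = 16, 4      # n-bit states; special states emit K leading zero bits
A, B = 2, 5            # filter tap offsets (toy choice); K - 1 + B < N_BITS

def run(state_bits, nout):
    """Return the first nout output bits from a list of N_BITS state bits."""
    s = list(state_bits)
    while len(s) < nout + B:
        i = len(s) - N_BITS
        s.append(s[i] ^ s[i + 2] ^ s[i + 3] ^ s[i + 5])   # linear recurrence
    return [s[j] ^ (s[j + A] & s[j + B]) for j in range(nout)]

def expand(short_name):
    """Expand an (N_BITS - K)-bit short name into the full special state."""
    s = [0] * K + [(short_name >> i) & 1 for i in range(N_BITS - K)]
    for j in range(K - 1, -1, -1):
        s[j] = s[j + A] & s[j + B]     # forces output bit j to be zero
    return s

def short_output(short_name):
    """Map a short name to its short output (output name minus K zeroes)."""
    out = run(expand(short_name), N_BITS)
    return sum(bit << i for i, bit in enumerate(out[K:]))
```

In this toy the bits $s_K, \ldots, s_{n-1}$ act as the short name, and the first K state bits are solved for one at a time, so all $2^{n-k}$ special states are enumerated without trial and error.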

We assume that DR > 1, and thus the available data contains at least one 
output which corresponds to some special state (if this is not the case we simply 
relax the definition of special states). We try to find the short name of any one
of these DR special states by applying our $TM^2D^2 = N^2$ inversion attack to
the reduced space with the modified parameters of DR and NR instead of D
and N. The factor $R^2$ cancels out of the expression $TM^2(DR)^2 = (NR)^2$,
and thus the tradeoff relation remains unchanged. However, we gain in two other
ways:



1. The original range of allowed values of $T$ was lower bounded by $D^2$, which
could be problematic for large values of $D$. This lower bound is now reduced
to $(DR)^2$, which can be as small as 1. This makes it possible to use a wider
range of $T$ parameters, and speed up actual attacks.

2. The number of expensive disk operations is reduced from $t$ to $tR$, since only
the $DR$ special points in the data have to be searched in the $t/D$ matrices
at a cost of one disk operation per matrix. This can greatly speed up attacks
with moderate values of $t$ in which the $t$ disk operations dominate the
function evaluations.
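
The cancellation of $R^2$ and the lowered bound on $T$ can be checked numerically. The following is a minimal sketch with assumed toy parameters ($N = 2^{64}$, $D = 2^{20}$, $R = 2^{-10}$, $T = 2^{44}$); the function name `memory_squared` is introduced here for illustration only.

```python
from fractions import Fraction

def memory_squared(T, D, N):
    """Return M^2 as implied by the tradeoff relation T * M^2 * D^2 = N^2."""
    return Fraction(N * N, T * D * D)

# Assumed toy parameters (not taken from the paper).
N, D, k = 2**64, 2**20, 10
R = Fraction(1, 2**k)          # sampling resistance R = 2^-k
T = 2**44                      # some T with D^2 <= T <= N

# Same T, but on the reduced space: D -> DR and N -> NR.
m2_full = memory_squared(T, D, N)
m2_reduced = memory_squared(T, int(D * R), int(N * R))

# Lower bound on T before and after sampling: D^2 versus (DR)^2.
lower_bound_full = D**2
lower_bound_reduced = int(D * R)**2

print(m2_full == m2_reduced, lower_bound_full, lower_bound_reduced)
```

The required memory is identical in both cases, while the smallest admissible $T$ shrinks by a factor of $R^2$.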

Table 1 summarizes the behavior of the three types of tradeoff attacks under 
the two types of sampling techniques discussed in this paper. It explains why 
BSW sampling can greatly reduce the time T, even though it has no effect on 
the asymptotic tradeoff relation itself. Only this type of sampling enabled [3] to 
attack A5/1 and find its 64 bit key in a few minutes of computation on a single 
PC using only 4,000 disk operations, given the data contained in the first two 
seconds of an encrypted GSM conversation. 



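Sampling type | BG attack              | Hellman’s attack       | Our attack
              | on stream ciphers      | on block ciphers       | on stream ciphers
--------------+------------------------+------------------------+----------------------------
Rivest        | new tradeoffs:         | unmodified tradeoff:   | unmodified tradeoff:
              | TP = MD = N            | TM^2 = N^2             | TM^2 D^2 = N^2
              | for 1 ≤ T ≤ D          | for 1 ≤ T ≤ N          | for D^2 ≤ T ≤ N
              | increased P            | fewer disk operations  | fewer disk operations
--------------+------------------------+------------------------+----------------------------
BSW           | unmodified tradeoff:   | inapplicable to        | unmodified tradeoff:
              | TM = N, 1 ≤ T ≤ D      | block ciphers          | TM^2 D^2 = N^2, wider
              |                        |                        | range, (RD)^2 ≤ T ≤ N
              |                        |                        | even fewer disk operations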



Table 1. The effect of sampling on tradeoff attacks. 
A The Sampling Resistance of Various Stream Cipher
Constructions 

As we have seen in the main part of the paper, low sampling resistance of a stream
cipher allows for more flexible tradeoff attacks. In this appendix we briefly review 
several popular constructions and discuss their sampling resistance. 

A.1 Non-linear Filter Generators

In many proposed constructions a single linear feedback shift register (LFSR) is
tapped in several locations, and a non-linear function $f$ of these taps produces
the output stream. Such stream ciphers are called non-linear filter generators,
and the non-linear function is called a filter. The sampling resistance of such
constructions depends on the location of the taps and on the properties of the
function $f$. A crucial factor in determining the sampling resistance of such
constructions is how many bits of the function’s input must be fixed so that the
function of the remaining bits is linear.

The multiplexor is a boolean function which takes $s = \log t + t$ bits of the output,
and treats the first $\log t$ bits as the address of a bit among the next $t$ bits. This bit
becomes the output of the function. In order to linearize the output of the
multiplexor one needs to fix only $\log t$ bits. The multiplexor is thus a weak function
in terms of linearization. The actual sampling resistance of the multiplexor is
influenced by the minimal distance between the address taps and the minimal
distance from the address taps to the output tap.
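
The linearization claim can be checked exhaustively on a small instance. The sketch below assumes $t = 4$ (so 2 address bits and 4 data bits) and treats the first address bit as the most significant one; `multiplexer` and `is_linear` are illustrative names, not taken from the paper.

```python
from itertools import product

def multiplexer(bits, t):
    """The first log2(t) bits form an address; return the addressed one of the next t bits."""
    addr_len = t.bit_length() - 1              # log2(t) for t a power of two
    address = int("".join(map(str, bits[:addr_len])), 2)
    return bits[addr_len + address]

def is_linear(func, width):
    """Check f(x XOR y) == f(x) XOR f(y) for all inputs, i.e. linearity over GF(2)."""
    inputs = list(product((0, 1), repeat=width))
    return all(
        func(tuple(a ^ b for a, b in zip(x, y))) == func(x) ^ func(y)
        for x in inputs
        for y in inputs
    )

t = 4
# The full multiplexor (address bits free) is not linear ...
full_is_linear = is_linear(lambda bits: multiplexer(bits, t), 2 + t)
# ... but it becomes linear once the log2(t) address bits are fixed.
fixed_address_is_linear = all(
    is_linear(lambda data, addr=addr: multiplexer(addr + data, t), t)
    for addr in product((0, 1), repeat=2)
)
print(full_is_linear, fixed_address_is_linear)
```

With the address fixed, the function is just a projection onto one data bit, which is why only $\log t$ bits need to be fixed.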

As a second example, consider the filter function

$$f(x_1, \ldots, x_s) = g(x_1, \ldots, x_{s-1}) \oplus x_s.$$

If there is a gap of length $l$ between tap $x_s$ and the other taps $x_1, \ldots, x_{s-1}$, then
the sampling resistance is at most $2^{-l}$, since by proper choice of the $s - 1$ bits we
can linearize the output of the function $f$. Suppose that our aim is to efficiently
enumerate all the states that produce a prefix of $l$ zeroes. We can do this
by setting the $n - l$ non-gap bits to an arbitrary value, and then at each clock we
choose the $x_s$ bit in a way that zeroes the function $f$ (assuming that feedback
taps are not present in the gap of $l$ bits).
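
The enumeration step can be sketched on a toy register. Everything concrete below is an assumption made for illustration: a 16-bit register that shifts toward higher indices, the tap positions, the feedback taps, and the function `g`. The gap consists of the `GAP` positions that reach the $x_s$ tap during the first `GAP` clocks, and no other tap lies inside it.

```python
import random

N_BITS, GAP, XS_TAP = 16, 4, 7           # assumed toy sizes; gap = positions 4..7
G_TAPS = [1, 3, 10, 13]                  # taps feeding g, all outside the gap
FEEDBACK_TAPS = [2, 12, 15]              # feedback taps, all outside the gap

def g(bits):
    """An arbitrary non-linear function of the g-taps (illustrative choice)."""
    return (bits[0] & bits[1]) ^ bits[2] ^ (bits[1] & bits[3])

def step(state):
    """Shift toward higher indices; the XOR of the feedback taps enters at index 0."""
    new_bit = 0
    for tap in FEEDBACK_TAPS:
        new_bit ^= state[tap]
    return [new_bit] + state[:-1]

def keystream(state, clocks):
    """Output f = g(...) XOR x_s for the given number of clocks."""
    out = []
    for _ in range(clocks):
        out.append(g([state[t] for t in G_TAPS]) ^ state[XS_TAP])
        state = step(state)
    return out

def special_state(non_gap_bits):
    """Complete the n - l non-gap bits to a state whose first GAP outputs are zero."""
    gap = range(XS_TAP - GAP + 1, XS_TAP + 1)
    free = iter(non_gap_bits)
    init = [0 if i in gap else next(free) for i in range(N_BITS)]
    for clock in range(GAP):
        state = list(init)
        for _ in range(clock):
            state = step(state)
        # The bit init[XS_TAP - clock] sits on the x_s tap at this clock:
        # set it equal to g so that f = g XOR x_s = 0.
        init[XS_TAP - clock] = g([state[t] for t in G_TAPS])
    return init

rng = random.Random(0)
example = special_state([rng.randint(0, 1) for _ in range(N_BITS - GAP)])
print(keystream(example, GAP))
```

Each of the $2^{n-l}$ settings of the non-gap bits yields exactly one special state, which is the enumeration the text describes.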



Sum of Products. A sum of products is the following boolean function: pick
a set of disjoint pairs of variables from the stream cipher’s state:
$(x_{i_1}, x_{i_2}), \ldots, (x_{i_{s-1}}, x_{i_s})$. Define the filter function as:

$$f(x_1, \ldots, x_s) = \bigoplus_{j=1}^{s/2} x_{i_{2j-1}} x_{i_{2j}}.$$



A sum of products becomes a linear function if $s/2$ of its variables (one for each
pair) are fixed. If these variables are all set equal to zero then $f$ becomes the
constant function $f = 0$. We can thus expect this function to have a moderate
resistance to sampling. The non-linear order of this function is only 2 and thus
by controlling any pair we can create any desired value of the filter
function. For example if the target pair is $(x_{i_1}, x_{i_2})$ then the function $f$ can be
decomposed into:

$$f(x_1, \ldots, x_s) = x_{i_1} x_{i_2} \oplus g(x_{i_3}, \ldots, x_{i_s}).$$

At each step if the value of $g$ is zero, the values of the target pair can be chosen
arbitrarily out of (0, 0), (0, 1), (1, 0). If however $g = 1$, then the value of the
target pair must be (1, 1). Thus if the control pair is in a tap-less region of size
$2l$ with a gap $l$ between the controlling taps, the sampling resistance of this
cipher is at most $2^{-l}$.

As another example, suppose that a consecutive pair of bits is used as a target 
pair. It seems problematic to use a consecutive pair for product linearization, 
since sometimes we have to set both bits to 1. This is however not the case if we 
relax our requirements, and use output prefixes with non-consecutive bits forced 
to have particular values. For example, prefixes in which every second bit is set 
to zero (and with arbitrary bits in between) can be easily generated in this sum 
of adjacent products. 

Suppose now that in each pair the first element is from the first half of the 
register and the second element comes from the second half. Suppose also that 
the feedback function taps the most significant bit and some taps from the lower 
half of the register. In this case the sampling resistance is only $2^{-n/2}$. We set
to arbitrary values the n/2 bits of the lower half of the register and guess the 
most significant tap bit. This way we know the input to the feedback function 
and linearize the output function. Forcing the output of the filter function at 
each step yields a linear equation (whose coefficients come from the lower half 
of the register and whose variables come from the upper half). After n/2 steps 
we have n/2 linear equations in n/2 variables which can be easily solved. This 
way we perform enumeration of all the states that produce the desired output. 

Moreover, if all pairs in the product are consecutive, then an even more interesting
property holds. We can linearize the function just by fixing a subset of
n/2 even (or odd) bits of the register, and thus linearization is preserved even 
after shifting the register (with possible interference of the feedback function). 

A.2 Shrinking and Self-Shrinking Generators

The shrinking generator is a simple construction suggested by [1] which is not
based on the filter idea. This generator uses two regularly clocked LFSRs and the
output of the first one decides whether the output of the second will appear in
the output stream or will be discarded. This generator has good statistical
properties like long periods and high linear complexity. A year later a self-shrinking
generator (which used one LFSR clocked twice) was proposed by [6]. The
output of the LFSR is determined by a pair of most significant bits $a_{n-1}, a_n$ of the
LFSR state: if $a_{n-1} = 1$ the output is $a_n$, and if $a_{n-1} = 0$ there is no output
in this clock cycle. This construction has the following sampling algorithm: pick
an arbitrary value for the $n/2$ decision bits, and for each pair with a decision bit equal
to 1 set the corresponding output bit to 0. If the decision bit is 0 then we have
freedom of choice and we enumerate both possibilities. The sampling resistance
of this construction is thus $2^{-n/4}$.
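
The sampling algorithm just described can be sketched as a direct enumeration. The sketch below makes a simplifying assumption: the state is viewed as $n/2$ (decision bit, data bit) pairs that are consumed in order, so the LFSR feedback plays no role for the bits produced from the initial state. The function names are illustrative.

```python
from itertools import product

def self_shrunk_prefix(pairs):
    """Output bits produced by the (decision, data) pairs held in the state."""
    return [data for decision, data in pairs if decision == 1]

def zero_prefix_states(n):
    """Enumerate all n-bit states (as n/2 pairs) whose output prefix is all zeroes."""
    states = []
    for decisions in product((0, 1), repeat=n // 2):
        # Decision bit 1: the data bit is output, so it must be 0.
        # Decision bit 0: nothing is output, so both data bits are allowed.
        choices = [((1, 0),) if d == 1 else ((0, 0), (0, 1)) for d in decisions]
        states.extend(list(state) for state in product(*choices))
    return states

states = zero_prefix_states(8)
print(len(states))
```

Every enumerated state is produced without trial and error; on average half of the $n/2$ pairs emit a bit, which matches the $n/4$ zero bits behind the stated sampling resistance.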
Abstract. At Asiacrypt ’99, Sun, Yang and Laih proposed three RSA
variants with short secret exponent that resisted all known attacks, in- 
cluding the recent Boneh-Durfee attack from Eurocrypt ’99 that im- 
proved Wiener’s attack on RSA with short secret exponent. The resis- 
tance comes from the use of unbalanced primes p and q. In this paper, we 
extend the Boneh-Durfee attack to break two out of the three proposed 
variants. While the Boneh-Durfee attack was based on Coppersmith’s 
lattice-based technique for finding small roots to bivariate modular poly- 
nomial equations, our attack is based on its generalization to trivariate 
modular polynomial equations. The attack is heuristic but works well
in practice, as does the Boneh-Durfee attack. In particular, we were able to
break in a few minutes the numerical examples proposed by Sun, Yang
and Laih. The results illustrate once again the fact that one should be
very cautious when using a short secret exponent with RSA.



1 Introduction 

The RSA [13] cryptosystem is the most widely used public-key cryptosystem.
However, RSA is computationally expensive, as it requires exponentiations
modulo $N$, where $N$ is a large integer (at least 1024 bits due to recent progress in
integer factorization [4]), the product of two primes $p$ and $q$. Consequently, speeding
up RSA has been a stimulating area of research since the invention of RSA.
Perhaps the simplest method to speed up RSA consists of shortening the
exponents of the modular exponentiations. If $e$ is the RSA public exponent and $d$
is the RSA secret exponent, one can either choose a small $e$ or a small $d$. The
choice of a small $d$ is especially interesting when the device performing secret
operations (signature generation or decryption) has limited computing power,
such as smartcards. Unfortunately, Wiener [20] showed over 10 years ago that if
$d < N^{0.25}$ then one could (easily) recover $d$ (and hence, the secret primes $p$ and
$q$) in polynomial time from $e$ and $N$ using the continued fractions algorithm.

T. Okamoto (Ed.): ASIACRYPT 2000, LNCS 1976, pp. 14-29, 2000. 

© Springer-Verlag Berlin Heidelberg 2000
Verheul and van Tilborg [19] slightly improved the bound in 1997, by showing
that Wiener’s attack could be applied to larger $d$, provided an exhaustive search
on about $2\log_2(d/N^{0.25})$ bits. At Eurocrypt ’99, Boneh and Durfee [3] presented
the first substantial improvement over Wiener’s bound. Their attack can (heuristically)
recover $p$ and $q$ in polynomial time if $d < N^{0.292}$. The attack is heuristic
because it is based on the seminal lattice-based work by Coppersmith [5] on
finding small roots to low-degree modular polynomial equations, in the bivariate
case.¹ However, it should be emphasized that the attack works very well in
practice.

At Asiacrypt ’99, Sun, Yang and Laih [18] noticed that all those attacks on
RSA with short secret exponent required some (natural) assumptions on the
public modulus $N$. For instance, Wiener’s bound only holds if
$p + q = O(\sqrt{N})$, and $e$ is not too large. Similar restrictions apply to the extension of
Wiener’s attack by Verheul and van Tilborg [19], and to the Boneh-Durfee attack [3].
This led Sun, Yang and Laih to propose in [18] simple variants of RSA using a
short secret exponent that, a priori, foiled all such attacks due to the previous
restrictions. More precisely, they proposed three RSA schemes, in which only the
(usual) RSA key generation is modified. In the first scheme, one chooses $p$ and $q$
of greatly different size, and a small exponent $d$ in such a way that the previous
attacks cannot apply. In particular, $d$ can even be smaller than $N^{0.25}$ if $p$ and $q$
are unbalanced enough. The second scheme consists of a tricky construction that
selects slightly unbalanced $p$ and $q$ in such a way that both $e$ and $d$ are small,
roughly around $\sqrt{N}$. The third scheme is a mix of the first two schemes, which
allows a trade-off between the sizes of $e$ and $d$. Sakai, Morii and Kasahara [14]
earlier proposed a different key generation scheme which achieves similar results
to the third scheme, but that scheme can easily be shown insecure (see [18]).

In this paper, we show that the first and third schemes of [18] are insecure,
by extending the Boneh-Durfee attack. Our attack can also break the second
scheme, but only if the parameters are carelessly chosen. Boneh and Durfee
reduced the problem of recovering the factors $p$ and $q$ to finding small roots
of a particular bivariate modular polynomial equation derived from the basic
equation $ed \equiv 1 \pmod{\phi(N)}$. Next, they applied an optimized version (for that
particular equation) of Coppersmith’s generic technique [5] for such problems.
However, when $p$ and $q$ are unbalanced, the particular equation used by Boneh
and Durfee is not enough, because it no longer has any “small” root. Our attack
extends the Boneh-Durfee method by taking into account the equation $N = pq$.
We work with a system of two modular equations with three unknowns;
interestingly, when $p$ and $q$ are unbalanced, this approach leads to an attack on systems
with $d$ even larger than the $N^{0.292}$ bound of Boneh and Durfee. The attack is
extremely efficient in practice: for typical instances of two of the schemes of [18],
this approach breaks the schemes within several minutes. Also, the “trivariate”
version of Coppersmith’s technique that we use may be of independent interest.

¹ The bivariate case is only heuristic for now, as opposed to the (simpler)
univariate case, for which the method can be proved rigorously. For more information,
see [5,2,12].
The remainder of this paper is organized as follows. In Section 2, we briefly
review former attacks on RSA with short secret exponents, recalling necessary
background on lattice theory and Coppersmith’s method to find small roots of
low-degree modular polynomial equations. This is useful to explain our attacks.
In Section 3, we describe the RSA schemes with short secret exponent of [18]. In
Section 4, we present the new attack using the trivariate approach. We discuss
an implementation of the attack and its running time on typical instances of the
RSA variants in Section 5.

2 Former Attacks on RSA with Short Secret Exponent 

All known attacks on RSA with short secret exponent focus on the equation
$ed \equiv 1 \pmod{\phi(N)}$ (where $\phi(N) = N - (p + q) + 1$) rewritten as:

$$ed = 1 + k\left(\frac{N+1}{2} - s\right) \qquad (1)$$

where $k$ is an unknown integer and $s = (p + q)/2$. The primes $p$ and $q$ can be
recovered from either d or s. Note that k and d are coprime. 

2.1 The Wiener Attack 

Wiener’s attack [20] is based on the continued fractions algorithm. Recall that
if two (unknown) coprime integers $A$ and $B$ satisfy $|x - A/B| < \frac{1}{2B^2}$ where $x$ is a
known rational, then $A/B$ can be obtained in polynomial time as a convergent of
the continued fraction expansion of $x$. Here, (1) implies that

$$\left|\frac{2e}{N} - \frac{k}{d}\right| = \frac{|2 + k(1 - 2s)|}{Nd}.$$

Therefore, if $\frac{|2 + k(1 - 2s)|}{Nd} < \frac{1}{2d^2}$, then $d$ can be recovered in polynomial time from $e$ and
$N$, as $k/d$ is a convergent of the continued fraction expansion of $2e/N$. That
condition can roughly be simplified to $ksd = O(N)$, and is therefore satisfied if
$k$, $s$ and $d$ are all sufficiently small. In the usual RSA key generation, $s = O(\sqrt{N})$
and $k = O(d)$, which leads to the approximate condition $d = O(N^{1/4})$. But the
condition gets worse if $p$ and $q$ are unbalanced, making $s$ much larger than $\sqrt{N}$.
For instance, if $p = O(N^{1/4})$, the condition becomes $d = O(N^{1/8})$.

The extension of Wiener’s attack by Verheul and van Tilborg [19] applies to
$d > N^{0.25}$ provided exhaustive search on $O(\log_2(d/N^{0.25}))$ bits if $p$ and $q$ are
balanced. Naturally, the attack requires much more exhaustive search if $p$ and $q$
are unbalanced.
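
As an illustration of the continued-fraction step, here is a minimal sketch of Wiener’s attack. It uses the common textbook normalization $ed = 1 + k\phi(N)$ and the expansion of $e/N$, which is an assumption that differs slightly from the $2e/N$ form of equation (1); the toy key ($N = 239 \cdot 379$, $e = 17993$, $d = 5$) is chosen here only for the example.

```python
from math import isqrt

def convergents(num, den):
    """Yield the convergents (A, B) of the continued fraction expansion of num/den."""
    a_prev, a_cur = 0, 1
    b_prev, b_cur = 1, 0
    while den:
        quotient, remainder = divmod(num, den)
        a_prev, a_cur = a_cur, quotient * a_cur + a_prev
        b_prev, b_cur = b_cur, quotient * b_cur + b_prev
        yield a_cur, b_cur
        num, den = den, remainder

def wiener(e, N):
    """Try each convergent k/d of e/N; return (d, p, q) on success, else None."""
    for k, d in convergents(e, N):
        if k == 0 or (e * d - 1) % k:
            continue
        phi = (e * d - 1) // k
        s = N - phi + 1                    # candidate for p + q
        disc = s * s - 4 * N               # discriminant of x^2 - s*x + N
        if disc < 0:
            continue
        root = isqrt(disc)
        if root * root == disc and (s + root) % 2 == 0:
            p, q = (s - root) // 2, (s + root) // 2
            if p * q == N:
                return d, p, q
    return None

print(wiener(17993, 90581))   # toy key with d = 5
```

The candidate $(k, d)$ is validated by checking that the implied $p + q$ yields an integer factorization of $N$, so no decryption oracle is needed.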

2.2 The Boneh-Durfee Attack 

The Small Inverse Problem. The Boneh-Durfee attack [3] looks at the
equation (1) modulo $e$:

$$-k\left(\frac{N+1}{2} - s\right) \equiv 1 \pmod{e}. \qquad (2)$$
Assume that the usual RSA key generation is used, so that $|s| < \sqrt{e}$ and $|k| < d$
(ignoring small constants). The problem of finding such a small root $(s, k)$ of that
bivariate modular equation was called the small inverse problem in [3], since one
is looking for a number $(N + 1)/2 - s$ close to $(N + 1)/2$ such that its inverse
$-k$ modulo $e$ is rather small. Note that heuristically, the small inverse problem
is expected to have a unique solution whenever $|k| < d < N^{0.5}$. This led Boneh
and Durfee to conjecture that RSA with $d < N^{0.5}$ is insecure.

Coppersmith [5] devised a general lattice-based technique to find sufficiently 
small roots of low-degree modular polynomial equations, which we will review 
in the next subsections, as it is the core of our attacks. By optimizing that 
technique to the specific polynomial of (2), Boneh and Durfee showed that one
could solve the small inverse problem (and hence, break RSA) when $d < N^{0.292}$.
This bound corresponds to the usual case of balanced p and q. It gets worse as 
p and q are unbalanced (see [3,18]), because s becomes larger. 

Lattice Theory. Coppersmith’s technique, like many public-key cryptanalyses, 
is based on lattice basis reduction. We only review what is strictly necessary for 
this paper. Additional information on lattice theory can be found in numerous 
textbooks, such as [6,17]. For the important topic of lattice-based cryptanalysis, 
we refer to the recent survey [12]. 

We will call lattice any subgroup of some $(\mathbb{Z}^n, +)$, which corresponds to the
case of integer lattices in the literature. Consequently, for any integer vectors
$b_1, \ldots, b_r$, the set $L(b_1, \ldots, b_r) = \{\sum_{i=1}^{r} n_i b_i \mid n_i \in \mathbb{Z}\}$ of all integer linear
combinations of the $b_i$’s is a lattice, called the lattice spanned by the $b_i$’s. In
fact, all lattices are of that form. When $L = L(b_1, \ldots, b_r)$ and the $b_i$’s are
further linearly independent (over $\mathbb{Z}$), then $(b_1, \ldots, b_r)$ is called a basis of $L$.
Any lattice $L$ has infinitely many bases. However, any two bases share some
things in common, notably the number of elements $r$ and the Gram determinant
$\det_{1 \le i,j \le r} \langle b_i, b_j \rangle$ (where $\langle \cdot, \cdot \rangle$ denotes the Euclidean dot product). The parameter
$r$ is called the lattice dimension (or rank), while the square root of the Gram
determinant is the lattice volume (or determinant), denoted by $\mathrm{vol}(L)$. The name
volume comes from the fact that the volume matches the $r$-dimensional volume of
the parallelepiped spanned by the $b_i$’s. In the important case of full-dimensional
lattices ($r$ equal to $n$), the volume is also the absolute value of the determinant of
any basis (hence the name determinant). In general, it is hard to give a “simple”
expression for the lattice volume, and one contents oneself with Hadamard’s
inequality to estimate the volume:

$$\mathrm{vol}(L) \le \prod_{i=1}^{r} \|b_i\|.$$

Fortunately, sometimes, the lattice is full-dimensional and we know a specific 
basis which is triangular, making the volume easy to compute. 
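
These definitions can be exercised on small examples. The sketch below computes the Gram determinant exactly and compares it with the Hadamard bound; the two bases are assumed toy examples, and the helper names are illustrative.

```python
from fractions import Fraction
from math import prod

def dot(u, v):
    """Euclidean dot product of two integer vectors."""
    return sum(a * b for a, b in zip(u, v))

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(rows), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det
        det *= rows[col][col]
        for r in range(col + 1, size):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return det

def volume_squared(basis):
    """vol(L)^2 is the Gram determinant det(<b_i, b_j>)."""
    return determinant([[dot(u, v) for v in basis] for u in basis])

def hadamard_squared(basis):
    """Square of the Hadamard bound: product of the squared norms."""
    return prod(dot(b, b) for b in basis)

triangular = [(2, 0, 0), (1, 3, 0), (1, 1, 4)]   # full-dimensional, triangular
rank_two = [(1, 2, 2), (2, 1, -2)]               # rank 2 inside Z^3, orthogonal
print(volume_squared(triangular), hadamard_squared(triangular))
print(volume_squared(rank_two), hadamard_squared(rank_two))
```

For the triangular basis the volume is simply the product of the diagonal entries, and for the orthogonal basis Hadamard’s inequality is tight.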

The volume is important because it enables one to estimate the size of
short lattice vectors. A well-known result by Minkowski shows that in any
$r$-dimensional lattice $L$, there exists a non-zero $x \in L$ such that
$\|x\| \le \sqrt{r} \cdot \mathrm{vol}(L)^{1/r}$, where $\|\cdot\|$ denotes the Euclidean norm. That bound is in some
(natural) sense the best possible. The LLL algorithm [9] can be viewed, from a
qualitative point of view, as a constructive version of Minkowski’s result. Given
any basis of some lattice $L$, the LLL algorithm outputs in polynomial time a
so-called LLL-reduced basis of $L$. The exact definition of an LLL-reduced basis
is beyond the scope of this paper; we only mention the properties that are of
interest here:

Fact 1. Any LLL-reduced basis $(b_1, \ldots, b_r)$ of a lattice $L$ in $\mathbb{Z}^n$ satisfies:
$$\|b_1\| \le 2^{r/2}\,\mathrm{vol}(L)^{1/r} \quad \text{and} \quad \|b_2\| \le 2^{r/2}\,\mathrm{vol}(L)^{1/(r-1)}.$$



Coppersmith’s Technique. For a discussion and a general exposition of Cop- 
persmith’s technique [5], see the recent surveys [2,12]. We describe the tech- 
nique in the bivariate case, following a simplified approach due to Howgrave- 
Graham [7]. 

Let $e$ be a large integer of possibly unknown factorization. Assume that
one would like to find all small roots of $f(x, y) \equiv 0 \pmod{e}$, where $f(x, y)$
is an integer bivariate polynomial with at least one monomial of maximal total
degree which is monic. If one could obtain two algebraically independent integral
bivariate polynomial equations satisfied by all sufficiently small modular roots
$(x, y)$, then one could compute (by resultant) a univariate integral polynomial
equation satisfied by $x$, and hence find efficiently all small $(x, y)$. Coppersmith’s
method tries to obtain such equations from reasonably short vectors in a certain
lattice. The lattice comes from the linearization of a set of equations of the form
$x^u y^v f(x, y)^w \equiv 0 \pmod{e^w}$ for appropriate integral values of $u$, $v$ and $w$. Such
equations are satisfied by any solution of $f(x, y) \equiv 0 \pmod{e}$. Small solutions
$(x_0, y_0)$ give rise to unusually short solutions to the resulting linear system,
hence short vectors in the lattice. To transform modular equations into integer
equations, one uses the following elementary lemma, with the (natural) notation
$\|h(x, y)\| = \sqrt{\sum_{i,j} a_{i,j}^2}$ for $h(x, y) = \sum_{i,j} a_{i,j} x^i y^j$:

Lemma 2. Let $h(x, y) \in \mathbb{Z}[x, y]$ be a polynomial which is a sum of at most
$r$ monomials. Suppose that $h(x_0, y_0) \equiv 0 \pmod{e^m}$ for some positive integer $m$
where $|x_0| < X$ and $|y_0| < Y$, and $\|h(xX, yY)\| < e^m/\sqrt{r}$. Then $h(x_0, y_0) = 0$
holds over the integers.
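
The reason the lemma holds can be sketched in one chain of inequalities. This is the standard argument, stated under the lemma’s own hypotheses and with $h(x, y) = \sum_{i,j} a_{i,j} x^i y^j$; the middle step is the Cauchy-Schwarz inequality applied to the at most $r$ non-zero terms.

```latex
\begin{align*}
|h(x_0, y_0)|
  &= \Bigl|\sum_{i,j} a_{i,j}\, x_0^i y_0^j\Bigr|
   \le \sum_{i,j} |a_{i,j}|\, X^i Y^j \\
  &\le \sqrt{r}\,\Bigl(\sum_{i,j} \bigl(a_{i,j} X^i Y^j\bigr)^2\Bigr)^{1/2}
   = \sqrt{r}\,\|h(xX, yY)\|
   < e^m .
\end{align*}
```

Since $h(x_0, y_0)$ is a multiple of $e^m$ whose absolute value is smaller than $e^m$, it must be zero.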

Now the trick is, given a parameter $m$, to consider the polynomials

$$h_{u_1,u_2,v}(x, y) = e^{m-v} x^{u_1} y^{u_2} f(x, y)^v,$$

where $u_1$, $u_2$ and $v$ are integers. Notice that any root $(x_0, y_0)$ of $f(x, y)$
modulo $e$ is a root modulo $e^m$ of $h_{u_1,u_2,v}(x, y)$, and therefore, of any integer linear
combination $h(x, y)$ of the $h_{u_1,u_2,v}(x, y)$’s. If such a combination $h(x, y)$ further
satisfies $\|h(xX, yY)\| < e^m/\sqrt{r}$, where $r$ is the number of monomials of $h$, then
by Lemma 2, the integer equation $h(x, y) = 0$ is satisfied by all sufficiently
small modular roots of $f$ modulo $e$. Thus, it suffices to find two algebraically
independent such equations $h_1(x, y)$ and $h_2(x, y)$.
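
The key property of the shifted polynomials, namely that a root of $f$ modulo $e$ is a root of every $h_{u_1,u_2,v}$ modulo $e^m$, can be verified numerically. The sketch below assumes a toy polynomial $f(x, y) = x(A + y) - 1$ and toy values $e = 101$, $A = 50$, $m = 3$; none of these come from the paper.

```python
def f(x, y, A):
    """Assumed toy polynomial f(x, y) = x * (A + y) - 1."""
    return x * (A + y) - 1

def h(u1, u2, v, x, y, *, e, m, A):
    """Shifted polynomial h_{u1,u2,v}(x, y) = e^(m-v) * x^u1 * y^u2 * f(x, y)^v."""
    return e ** (m - v) * x ** u1 * y ** u2 * f(x, y, A) ** v

e, A, m = 101, 50, 3
y0 = 3
x0 = pow(A + y0, -1, e)        # chosen so that x0 * (A + y0) = 1 (mod e)

residues = {
    h(u1, u2, v, x0, y0, e=e, m=m, A=A) % e ** m
    for u1 in range(3)
    for u2 in range(3)
    for v in range(m + 1)
}
print(f(x0, y0, A) % e, residues)
```

Each factor $f(x_0, y_0)^v$ contributes $e^v$ and the prefactor contributes $e^{m-v}$, so every residue modulo $e^m$ vanishes.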

The use of integer linear combination suggests that we represent the
polynomials as vectors in a lattice, so that finding polynomials with small norm
reduces to finding short vectors in a lattice. More precisely, let $S$ be a set of
indices $(u_1, u_2, v)$, and choose a representation of the polynomials $h_{u_1,u_2,v}(x, y)$
with $(u_1, u_2, v) \in S$ as $n$-dimensional integer vectors for some $n$. Let $L$ be the
lattice in $\mathbb{Z}^n$ spanned by the vectors corresponding to $h_{u_1,u_2,v}(xX, yY)$ with
$(u_1, u_2, v) \in S$. Apply the LLL algorithm on the lattice, and let $h_1(xX, yY)$ and
$h_2(xX, yY)$ be the polynomials corresponding to the first two vectors of the
reduced basis obtained. Denoting by $r$ the dimension of $L$, one deduces from the
LLL theoretical bounds that:

$$\|h_1(xX, yY)\| \le 2^{r/2}\,\mathrm{vol}(L)^{1/r} \quad \text{and} \quad \|h_2(xX, yY)\| \le 2^{r/2}\,\mathrm{vol}(L)^{1/(r-1)}.$$

To apply Lemma 2, we want both of these upper bounds to be less than $e^m/\sqrt{n}$;
since the factor $2^{r/2}$ is negligible with respect to $e^m$, this amounts to saying

$$\mathrm{vol}(L) \ll e^{mr}. \qquad (3)$$

There are two problems. The first problem is that even if this condition is satis- 
fied, so that Lemma 2 applies, we are not guaranteed that the integer equations 
$h_1(x, y) = 0$ and $h_2(x, y) = 0$ obtained are algebraically independent. In other
words, $h_2$ will provide no additional information beyond $h_1$ if the two linearly
independent short basis vectors do not also yield algebraically independent equa- 
tions. It is still an open problem to state precisely when this can be guaranteed, 
although all experiments to date suggest this is an accurate heuristic assumption 
to make when inequality (3) holds. We note that a similar assumption is used 
in the work of Bleichenbacher [1] and Jutla [8]. 

The second problem is more down-to-earth: how can we make sure that $\mathrm{vol}(L)$
is small enough to satisfy inequality (3)? Note that Hadamard’s bound is
unlikely to be useful. Indeed, in general, some of the coefficients of $f(x, y)$ are about
the size of $e$, so that $\|h_{u_1,u_2,v}(xX, yY)\|$ is at least $e^m$. To address this problem,
one must choose in a clever way the set of indices $S$ to have a close estimate on
$\mathrm{vol}(L)$. The simplest solution is to choose $S$ so that $L$ is full-dimensional ($r$ equal
to $n$) and the $h_{u_1,u_2,v}(xX, yY)$’s form a triangular matrix for some ordering on
the polynomials and on the monomials (the vector coordinates). Since we want
$\mathrm{vol}(L)$ to be small, each coefficient on the diagonal should be the smallest one
of $h_{u_1,u_2,v}(xX, yY) = e^{m-v}(xX)^{u_1}(yY)^{u_2} f(xX, yY)^v$, which is likely to be the
one corresponding to the monic monomial of maximal total degree of $f(x, y)$.

In the general case, $f(x, y)$ may have several monomials of maximal total
degree, and the only simple choice of $S$ is to cover all the monomials of total
degree less than some parametrized bound. More precisely, if $\Delta$ is the total
degree of $f(x, y)$, and $x^{\alpha} y^{\Delta - \alpha}$ is a monic monomial of $f(x, y)$, one defines $S$ as
the set of $(u_1, u_2, v)$ such that $u_1 + u_2 + \Delta v \le h\Delta$ and $u_1, u_2, v \ge 0$ with $u_1 < \alpha$
or $u_2 < \Delta - \alpha$. Then the volume of the corresponding lattice can be computed
exactly, and it turns out that (3) is satisfied whenever $XY < e^{1/\Delta - \varepsilon}$ and $m$
is sufficiently large.

However, depending on the shape of $f(x, y)$ (represent each monomial $x^i y^j$
by the point $(i, j)$), other choices of $S$ might lead to improved bounds. Boneh and
Durfee applied such tricks to the polynomial (2). In [3], they discussed several
choices of $S$. Using certain sets $S$ for which the lattice is full-dimensional and
one knows a triangular lattice basis, they obtained a first bound $d < N^{0.284}$
for their attack. Next, they showed that using a slightly different $S$ for which
the lattice is no longer full-dimensional, one ends up with the improved bound
$d < N^{0.292}$. The latter choice of $S$ is much harder to analyze. For more details,
see [3].

3 The Sun-Yang-Laih RSA Key Generation Schemes 

3.1 Scheme (I) 

The first scheme corresponds to a simple unbalanced RSA [15] in which the 
parameters are chosen to foil previously known attacks: 

1. Select two random primes $p < q$ such that both $p$ and $N = pq$ are
sufficiently large to foil factorization algorithms such as ECM and NFS. The
more unbalanced $p$ and $q$ are, the smaller $d$ can be.

2. Randomly select the secret exponent $d$ such that $\log_2 d + \log_2 p > \frac{1}{3} \log_2 N$
and $d > 2^{\gamma} \sqrt{p}$, where $\gamma$ is the security parameter (larger than 64).

3. If the public exponent $e$ defined by $ed \equiv 1 \pmod{\phi(N)}$ is not larger than
$\phi(N)/2$, one restarts the previous step.

A choice of parameters suggested by the authors is: $p$ is a 256-bit prime, $q$ is a
768-bit prime, $d$ is a 192-bit number. Note that 192 is far below Wiener’s bound
(256 bits) and Boneh-Durfee’s bound (299 bits).

3.2 Scheme (II) 

The second scheme selects one of the primes in such a way that one can select e 
and d to be small at the same time: 

1. Fix the bit-length of N. 

2. Select a random prime $p$ of $\frac{1}{2} \log_2 N - 112$ bits, and a random $k$ of 112 bits.

3. Select a random $d$ of $\frac{1}{2} \log_2 N + 56$ bits coprime with $k(p - 1)$.

4. Compute the two Bezout integers $u$ and $v$ such that $du - k(p - 1)v = 1$,
$0 < u < k(p - 1)$ and $0 < v < d$.

5. Return to Step 3 if $v + 1$ is not coprime with $d$.

6. Select a random $h$ of 56 bits until $q = v + hd + 1$ is prime.

The RSA parameters are $p$, $q$, $e = u + hk(p - 1)$, $d$ and $N = pq$. Notice that $e$ and
$d$ satisfy the equation $ed = 1 + k\phi(N)$. They both have approximate bit-length
$\frac{1}{2} \log_2 N + 56$. The primes $p$ and $q$ have approximate bit-length $\frac{1}{2} \log_2 N - 112$
and $\frac{1}{2} \log_2 N + 112$ respectively.

A possible choice of parameters for Scheme (II) might be: $p$ a 400-bit prime,
$q$ a 624-bit prime, and $e$ and $d$ each 568-bit integers.
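
The six steps of Scheme (II) can be sketched at toy scale. The sizes below (a 64-bit $N$ with the constant 112 replaced by an assumed `delta = 16`), the bounded retry loop in Step 6, and all function names are assumptions made so that the example runs quickly; this is not a secure or faithful implementation.

```python
import random
from math import gcd

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_prime(n):
    """Miller-Rabin with fixed bases; deterministic for the toy sizes used here."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in SMALL_PRIMES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def random_bits(rng, bits):
    """Random integer with exactly `bits` bits (top bit forced to 1)."""
    return rng.getrandbits(bits) | (1 << (bits - 1))

def scheme2_keygen(n_bits=64, delta=16, seed=1):
    rng = random.Random(seed)
    p = random_bits(rng, n_bits // 2 - delta)              # Step 2
    while not is_prime(p):
        p = random_bits(rng, n_bits // 2 - delta)
    k = random_bits(rng, delta)
    modulus = k * (p - 1)
    while True:
        d = random_bits(rng, n_bits // 2 + delta // 2)     # Step 3
        if gcd(d, modulus) != 1:
            continue
        u = pow(d, -1, modulus)                            # Step 4: d*u - k(p-1)*v = 1
        v = (d * u - 1) // modulus
        if gcd(v + 1, d) != 1:                             # Step 5
            continue
        for _ in range(200):                               # Step 6, bounded for the toy
            h = random_bits(rng, delta // 2)
            q = v + h * d + 1
            if is_prime(q):
                e = u + h * modulus
                return {"p": p, "q": q, "N": p * q, "e": e, "d": d, "k": k}

key = scheme2_keygen()
print(key)
```

The returned values satisfy $ed = 1 + k\phi(N)$ exactly, because $e = u + hk(p-1)$ and $q - 1 = v + hd$.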
3.3 Scheme (III)

The third scheme is a mix of the first two schemes, allowing a trade-off between $e$
and $d$ such that $\log_2 e + \log_2 d \approx \log_2 N + \ell_k$ where $\ell_k$ is a predetermined constant.
More precisely, the scheme is a parametrized version of scheme II: $p$, $k$, $d$ and $h$
have respective bit-length $\ell_p$ (less than $\frac{1}{2} \log_2 N$), $\ell_k$, $\ell_d$, and $\log_2 N - \ell_p - \ell_d$.
To resist various attacks, the following is required:

1. $\ell_k \gg \ell_p - \ell_d + 1$.

2. $4\alpha(2\beta + \alpha - 1) > 3(1 - \beta - \alpha)^2$, where $\alpha$ and $\beta$ are defined as in [18].

3. $k$ must withstand an exhaustive search and $\ell_k + \ell_p > \frac{1}{3} \log_2 N$.

A choice of parameters suggested by the authors is: $p$ is a 256-bit prime, $q$
is a 768-bit prime, $e$ is an 880-bit number, and $d$ is a 256-bit number.



4 The Attack Algorithm 



In this section we demonstrate how to launch an attack on Schemes (I) and (III).
The approach used here closely follows that taken by Boneh and Durfee [3], but
differs in several crucial ways to allow it to work when the factors p and q of the 
public modulus N are unbalanced. Interestingly, our attack gets better (works 
for larger and larger d) the more unbalanced the factors of the modulus become. 

Recall the RSA equation

$$ed = 1 + k\left(\frac{N+1}{2} - s\right).$$

We note that the Boneh-Durfee approach treats this as an equation modulo $e$
with two “small” unknowns, $k$ and $s = (p + q)/2$. This approach no longer works if
$p$ and $q$ are unbalanced, since a good bound on $s$ can no longer be established. For
this reason, the authors of the schemes from Section 3 hoped that these schemes
would resist the lattice-based cryptanalysis outlined in Section 2.2. However, we
will see that a more careful analysis of the RSA equation, namely one that does
not treat $p + q$ as a single unknown quantity but instead leaves $p$ and $q$ separately
as unknowns, leads to a successful attack against two of these schemes.

Writing $A = N + 1$, the RSA equation implies

$$2 + k(A - p - q) \equiv 0 \pmod{e}.$$

The critical improvement of our attack is to view this as a modular equation
with three unknowns, $k$, $p$, $q$, with the special property that the product $pq$ of
two of them is the known quantity $N$. We may view this problem as follows:
given a polynomial $f(x, y, z) = x(A + y + z) - 2$, find $(x_0, y_0, z_0)$ satisfying:

$$f(x_0, y_0, z_0) \equiv 0 \pmod{e},$$

where

$$|x_0| < X, \quad |y_0| < Y, \quad |z_0| < Z, \quad \text{and} \quad y_0 z_0 = N.$$
Note that the bounds $X \approx ed/N$, $Y \approx p$, and $Z \approx q$ can be estimated to within
a power of 2 based on the security parameters chosen for the scheme.
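
A small numeric check makes the root of $f$ concrete. The sketch below assumes a toy key ($p = 239$, $q = 379$, $e = 17993$, $d = 5$, chosen only for illustration) and the sign convention $(x_0, y_0, z_0) = (-k, -p, -q)$, which is one consistent way to match $f(x, y, z) = x(A + y + z) - 2$ with $2 + k(A - p - q) \equiv 0 \pmod{e}$.

```python
# Assumed toy RSA key, for illustration only.
p, q = 239, 379
N = p * q
e, d = 17993, 5
phi = (p - 1) * (q - 1)

A = N + 1
# From 2ed = 2 + k(A - p - q), where A - p - q equals phi(N).
k = 2 * (e * d - 1) // phi

def f(x, y, z):
    """The trivariate polynomial f(x, y, z) = x(A + y + z) - 2."""
    return x * (A + y + z) - 2

x0, y0, z0 = -k, -p, -q
print(f(x0, y0, z0) % e, y0 * z0 == N)
```

With this convention $f(x_0, y_0, z_0) = -2ed$, which is a multiple of $e$, and the side condition $y_0 z_0 = N$ holds.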

Following Coppersmith’s method, our approach is to pick $r$ equations of
the form $e^{m-v} x^{u_1} y^{u_2} z^{u_3} f^v(x, y, z)$ and to search for low-norm integer linear
combinations of these polynomials. The basic idea is to start with a handful
of equations of the form $y^{a+j} f^m(x, y, z)$ for $j = 0, \ldots, t$ for some integers $a$
and $t$ with $t \ge 0$. Knowing $N = pq$ allows us to replace all occurrences of the
monomial $yz$ with the constant $N$, reducing the number of variables in each of
these equations to approximately $m^2$ instead of the expected order of $m^3$. We will refer
to these as the primary polynomials.

Since there are only $t + 1$ of these equations, this will result in a lattice
that is less than full rank; we therefore include some additional equations to
bring the lattice to full rank in order to compute its determinant. We refer
to these as the helper polynomials. We have a great deal of choice in picking
the helper polynomials; naturally, some choices are better than others, and it
is generally a tedious but straightforward optimization problem to choose the
primary and helper polynomials that are optimal. The equations we work with
are the following. Fix an integer $m$, and let $a$ and $t \ge 0$ be integers which we
will optimize later. We define

• $g_{k,i,b}(x, y, z) := e^{m-k} x^i y^a z^b f^k(x, y, z)$, for $k = 0..(m - 1)$, $i = 1..(m - k)$,
and $b = 0, 1$; and,

• $h_{k,j}(x, y, z) := e^{m-k} y^{a+j} f^k(x, y, z)$, for $k = 0..m$ and $j = 0..t$.

The primary polynomials are $h_{m,j}(x, y, z)$ for $j = 0, \ldots, t$, and the rest are the
helper polynomials. Following Coppersmith’s technique, we form a lattice $L$ by
representing $g_{k,i,b}(xX, yY, zZ)$ and $h_{k,j}(xX, yY, zZ)$ by their coefficient vectors,
and use LLL to find low-norm integer linear combinations $h_1(xX, yY, zZ)$ and
$h_2(xX, yY, zZ)$. The polynomials $h_1(x, y, z)$ and $h_2(x, y, z)$ have $(k, p, q)$ as a
root over the integers; to remove $z$ as an unknown, we use the equality $z = N/y$,
obtaining $H_1(x, y)$ and $H_2(x, y)$ which have $(k, p)$ as a solution. Taking the
resultant $\mathrm{Res}_x(H_1(x, y), H_2(x, y))$ yields a polynomial $H(y)$ which has $p$ as a
root. Using standard root-finding techniques allows us to recover the factor $p$ of
$N$ efficiently, completing the attack.

The running time of this algorithm is dominated by the time to run LLL on
the lattice $L$, which has dimension $(m + 1)(m + t + 1)$. So it would be ideal to
keep the parameters $m$ and $t$ as low as possible, limiting to a reasonable number
the polynomials used to construct $L$. Surprisingly, the attack is successful even
if only a handful of polynomials are used. The example given by the original
authors for Scheme (I) succumbs easily to this attack with $m = 3$ and $t = 1$;
with these parameters, our attack generates 20 polynomials. Scheme (III) can
be cryptanalyzed with parameters $m = 2$ and $t = 2$, yielding 15 polynomials.
This gives lattices of dimension 20 (see Figure 1) and 15, respectively, which
can be reduced via the LLL algorithm within a matter of seconds on a desktop
computer. We discuss our implementation and the results of our experiments
more in Section 5.
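
The polynomial counts quoted above can be checked by enumerating the index sets. The following is a small sketch in Python (the function names are ours, not the authors'); it assumes the index ranges $k = 0..(m-1)$, $i = 1..(m-k)$, $b = 0,1$ for the $g_{k,i,b}$ and $k = 0..m$, $j = 0..t$ for the $h_{k,j}$.

```python
def polynomial_indices(m, t):
    """Index tuples of the helper/primary polynomials g_{k,i,b} and h_{k,j}."""
    g = [(k, i, b)
         for k in range(m)               # k = 0 .. m-1
         for i in range(1, m - k + 1)    # i = 1 .. m-k
         for b in (0, 1)]                # b = 0, 1
    h = [(k, j)
         for k in range(m + 1)           # k = 0 .. m
         for j in range(t + 1)]          # j = 0 .. t
    return g, h


def lattice_dimension(m, t):
    """Number of polynomials, i.e. the dimension of the lattice L."""
    g, h = polynomial_indices(m, t)
    return len(g) + len(h)


for m, t in [(3, 1), (2, 2)]:
    dim = lattice_dimension(m, t)
    # The count agrees with the closed form (m + 1)(m + t + 1).
    assert dim == (m + 1) * (m + t + 1)
    print(f"m={m}, t={t}: {dim} polynomials")
```

For $(m, t) = (3, 1)$ and $(2, 2)$ this gives 20 and 15 polynomials, matching the dimensions stated for Schemes (I) and (III).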







4.1 Analysis of the Attack 

In order to be sure that LLL returns vectors that are "short enough" to use
Lemma 2, we must derive sufficiently small bounds on the determinant of the
lattice $L$ formed from the polynomials $g_{k,i,b}(xX, yY, zZ)$ and $h_{k,j}(xX, yY, zZ)$.
Fortunately, this choice of polynomials makes the computation of the determinant
of $L$ fairly straightforward, if somewhat tedious. We provide the details in
the appendix.



Representing the Lattice as a Triangular Matrix. In order to compute the
volume of the lattice $L$, we would like to list the polynomials $g_{k,i,b}(xX, yY, zZ)$
and $h_{k,j}(xX, yY, zZ)$ in a way that yields a triangular matrix. There is an
ordering on these polynomials that leads to such a representation: we first list the
$g_{k,i,b}(xX, yY, zZ)$ indexed outermost by $k = 0, \ldots, m-1$, then $i = 1, \ldots, m-k$,
then innermost by $b = 0, 1$. We then list $h_{k,j}(xX, yY, zZ)$ indexed outermost by
$k = 0, \ldots, m$ then $j = 0, \ldots, t$. (See Figure 1 for the case of $m = 2$, $t = 1$,
$a = 1$.) Each new polynomial introduces exactly one new monomial, of the form
$x^u y^v$ or $x^u z^v$.

Note that no monomial involving the product $yz$ appears, since $yz$ can
be eliminated using the identity $N = yz$.

The determinant of this matrix is simply the product of the entries on the
diagonal, which for $m = 3$, $t = 1$, $a = 1$ is

$$\mathrm{vol}(L) = \det(M) = e^{40} X^{40} Y^{34} Z^{4}.$$

We expect the LLL algorithm to return vectors short enough to use Lemma 2
when

$$\mathrm{vol}(L) = e^{40} X^{40} Y^{34} Z^{4} < e^{mr} = e^{60},$$

where $r = 20$ is the rank of the lattice.
The example given by the original authors for Scheme (I) is to use p of 256 bits,
q of 768 bits, d of 192 bits, and e of 1024 bits. This gives bounds

$$X \approx ed/N \approx e^{3/16}, \quad Y \approx e^{1/4}, \quad \text{and} \quad Z \approx e^{3/4};$$

we may then confirm

$$\det(M) = e^{40} X^{40} Y^{34} Z^{4} \approx e^{59} < e^{60} = e^{mr},$$

so Lemma 2 applies. Therefore, when we run the LLL algorithm on this lattice,
we will get two short vectors corresponding to polynomials $h_1(x,y,z)$, $h_2(x,y,z)$;
by the bound on the determinant, we know that these polynomials will have
Footnote (on eliminating $yz$): Caution must be taken to ensure the polynomials
remain monic in their terms of highest degree; if the substitution $yz \mapsto N$
causes a coefficient of such a term to be different from 1, then we multiply the
polynomial by the inverse of that coefficient modulo $e^m$ (and reduce mod $e^m$
as appropriate) before continuing.

Footnote (on the Lemma 2 bound): The reader may have noticed that we have
suppressed the error term associated with the execution of the LLL algorithm.
Interestingly, even if the LLL "fudge factor" is
Fig. 1. Example of the lattice formed by the vectors $g_{k,i,b}(xX, yY, zZ)$ and
$h_{k,j}(xX, yY, zZ)$ when $m = 2$, $t = 1$, and $a = 1$. The matrix is lower
triangular. The marked off-diagonal entries are quantities whose values do
not affect the determinant calculation. The polynomials used are listed on the
left, and the monomials they introduce are listed across the top. The double line
break occurs between the $g_{k,i,b}$ and the $h_{k,j}$, while the single line breaks occur
between increments of $k$. The last single line break separates the helper polynomials
(top) from the two primary polynomials (bottom).



norm that is low enough to use Lemma 2. Therefore these polynomials will have
$(k,p,q)$ as a solution over the integers. To turn these into bivariate equations,
we use the equality $z = N/y$ to get $H_1(x,y)$ and $H_2(x,y)$ which have $(k,p)$ as a
solution over the integers. We then take the resultant $\mathrm{Res}_x(H_1(x,y), H_2(x,y))$
to obtain a univariate polynomial $H(y)$ that has $p$ as a root.
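
As a sanity check on the Scheme (I) calculation in this subsection, the exponents of the determinant can be recomputed from the closed forms given in the appendix. The sketch below is ours and rests on two assumptions: that the appendix formulas read as transcribed in the code, and that the unknowns are bounded by $X \approx e^{3/16}$, $Y \approx e^{1/4}$, $Z \approx e^{3/4}$ (a 192-bit $d$, 256-bit $p$ and 768-bit $q$ against a 1024-bit $e$).

```python
from fractions import Fraction


def exponents(m, t, a):
    """Exponents (C_e, C_x, C_y, C_z) in det(M) = e^Ce * X^Cx * Y^Cy * Z^Cz."""
    ce = Fraction(m * (m + 1) * (4 * m + 3 * t + 5), 6)
    quad = 3 * t * t + 6 * a * t + 3 * a * a
    if a >= 0:
        cy_tail = quad + 4 * a + 3 * t - a ** 3
        cz_tail = 3 * a * a - 2 * a - a ** 3
    else:
        cy_tail = quad + 3 * a + 3 * t
        cz_tail = 3 * a * a - 3 * a
    cy = Fraction(m ** 3 + 3 * (a + t + 1) * m ** 2
                  + (quad + 6 * a + 6 * t + 2) * m + cy_tail, 6)
    cz = Fraction(m ** 3 - 3 * (a - 1) * m ** 2
                  + (3 * a * a - 6 * a + 2) * m + cz_tail, 6)
    return ce, ce, cy, cz


m, t, a = 3, 1, 1
ce, cx, cy, cz = exponents(m, t, a)
assert (ce, cx, cy, cz) == (40, 40, 34, 4)

# Sizes of X, Y, Z expressed as powers of e (assumed bounds, see lead-in).
log_e_det = ce + cx * Fraction(3, 16) + cy * Fraction(1, 4) + cz * Fraction(3, 4)
rank = (m + 1) * (m + t + 1)
assert log_e_det == 59 and log_e_det < m * rank   # e^59 < e^60
```

The margin between 59 and 60 is small, which is why the text takes care to discuss the LLL error term separately.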

More generally, if we pick optimal values for $t$ and $a$ and take $m$ sufficiently large,
our attack will be successful for even larger bounds on $d$. The highest possible
bound on $d$ for which our attack can work depends on the parameters chosen for
the scheme. Suppose the parameter $d \approx N^{\delta}$ is used. The table below summarizes

taken into account, this bound is still good enough. We require 
vol(L) < = 2"°°e®® < e®®+s < 

Slightly larger parameters m and t are required to rigorously obtain the bound for 
norm of the second basis vector, although in practice the LLL algorithm works well 
enough so that the parameters chosen here are sufficient. 
the largest possible $\delta$ for which our attack can succeed. We point out the choices
of parameters that give rise to the schemes of Section 3.



[Fig. 2. Largest $\delta$ for which the attack can succeed, tabulated against
$\log_N(e)$ and $\log_N(p)$.]
For example, with the example for Scheme (I), where $e \approx N$ and $p \approx N^{0.25}$,
our attack will be successful not only for the $\delta = 0.188$ suggested, but all the
way up to $\delta < 0.364$ (assuming a large enough $m$ is used). Similarly, our attack
works in Scheme (III) up to d < Notice that our attack comes close to, 

but cannot quite reach, the d < required to break Scheme (II). 

4.2 Comparison with the Bivariate Approach 

Alternatively, one can consider the system of two modular equations with three 
unknowns as a single bivariate equation by incorporating the equation N = 
pq into the main trivariate equation. This was independently noticed by Willi
Meier [11], who also addressed the problem of breaking Schemes (I) and (III), 
using a bivariate approach rather than our trivariate approach. One then obtains 
an equation of the form f{x, y) = x^y + Axy + Bx + Cy modulo e, where the 
unknowns are k and the smallest prime among p and q. 

However, it turns out that the application of Coppersmith’s technique to this 
particular bivariate equation yields worse bounds than with the trivariate ap- 
proach previously described. For example, the bivariate approach allows one to 
break scheme (I) as long as d < (and perhaps slightly higher, if sublattices 

are considered as in [3]), but fails for larger d. One can view the bivariate
approach as a special case of our trivariate approach, in which one degree of freedom
for optimization has been removed. One then sees that the bivariate approach 
constrains the choice of primary and helper polynomials in a suboptimal way, 
resulting in worse bounds on d. 



5 Implementation 

We implemented this attack using Victor Shoup’s Number Theory Library [16] 
and the Maple Analytical Computation System [10]. The attack runs very ef- 
ficiently, and in all instances of Schemes (I) and (III) we tested, it produced 







algebraically independent polynomials $H_1(x,y)$ and $H_2(x,y)$. These yielded a
resultant $H(y) = (y - p) H_0(y)$, where $H_0(y)$ is irreducible, exposing the factor $p$
of $N$ in every instance. This strongly suggests that this "heuristic" assumption
needed to complete the multivariate modular version of Coppersmith's technique
is extremely reliable, and we conjecture that it always holds for suitably bounded
lattices of this form. The running times of our attacks are given below.

Scheme  size of n  size of p  size of e  size of d  m  t  a  lattice rank  running time
        (bits)     (bits)     (bits)     (bits)
I       1024       256        1024       192        3  1  1  20            40 seconds
III     1024       256        880        256        2  2  0  15            9 seconds

These tests were run on a 500MHz Pentium III running Solaris. 

6 Conclusions and Open Problems 

We showed that unbalanced RSA [15] actually improves the attacks on short
secret exponents by allowing larger exponents. This enabled us to break most of
the RSA schemes [18] with short secret exponent from Asiacrypt '99. The attack
extends the Boneh-Durfee attack [3] by using a "trivariate" version of Coppersmith's
lattice-based technique for finding small roots of low-degree modular
polynomial equations. Unfortunately, despite experimental evidence, the attack
is for now only heuristic, as is the Boneh-Durfee attack. It is becoming increasingly
important to find sufficient conditions under which Coppersmith's technique
on multivariate modular polynomials can be proved.

Our results illustrate once again the fact that one should be very cautious
when using RSA with short secret exponent. To date, the best method to enjoy
the computational advantage of short secret exponent is the following
countermeasure proposed by Wiener [20]. When $N = pq$, the idea is to use a private
exponent $d$ such that both $d_p = d \bmod (p - 1)$ and $d_q = d \bmod (q - 1)$ are small.
Such a $d$ speeds up RSA signature generation since RSA signatures are often
generated modulo $p$ and $q$ separately and then combined using the Chinese
Remainder Theorem. Classical attacks do not work since $d$ is likely to be close to
$\phi(N)$. It is an open problem whether there is an efficient attack on such secret
exponents. The best known attack runs in time $\min(\sqrt{d_p}, \sqrt{d_q})$.
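
To make the countermeasure concrete, here is a toy sketch in Python of signing with small CRT exponents. Everything in it is illustrative: the primes are tiny, the helper names `full_exponent` and `crt_sign` are ours, and the brute-force search for $d$ is only feasible at this size. It requires Python 3.8 or later for `pow(x, -1, n)`.

```python
from math import gcd


def full_exponent(p, q, dp, dq):
    """Find d with d = dp mod (p-1) and d = dq mod (q-1) by exhaustive search."""
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    for d in range(1, lam):
        if d % (p - 1) == dp and d % (q - 1) == dq:
            return d, lam
    raise ValueError("dp and dq are incompatible modulo gcd(p-1, q-1)")


def crt_sign(msg, p, q, dp, dq):
    """Exponentiate modulo p and q separately, then recombine (Garner's formula)."""
    sp = pow(msg, dp, p)
    sq = pow(msg, dq, q)
    q_inv = pow(q, -1, p)
    return sq + q * (((sp - sq) * q_inv) % p)


p, q = 11, 23            # toy primes
dp, dq = 3, 5            # small CRT exponents (both odd, coprime to p-1 and q-1)
d, lam = full_exponent(p, q, dp, dq)
e = pow(d, -1, lam)      # matching public exponent

sig = crt_sign(42, p, q, dp, dq)
assert pow(sig, e, p * q) == 42      # signature verifies
print("d =", d, "of lambda =", lam)  # d itself is not small
```

Even in this toy instance the full exponent $d$ is large relative to $\mathrm{lcm}(p-1, q-1)$ although $d_p$ and $d_q$ are small, which is the property the text relies on.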
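A General Calculation of the Determinant

The general formula for the determinant of the lattice we build in Section 4 is

$$\mathrm{vol}(L) = \det(M) = e^{C_e} X^{C_x} Y^{C_y} Z^{C_z},$$

where

$$C_e = C_x = \tfrac{1}{6}\, m(m+1)(4m + 3t + 5),$$

$$C_y = \begin{cases}
\tfrac{1}{6}\big(m^3 + 3(a+t+1)m^2 + (3t^2 + 6at + 3a^2 + 6a + 6t + 2)m + (3t^2 + 6at + 3a^2 + 4a + 3t - a^3)\big) & \text{if } a \ge 0, \\
\tfrac{1}{6}\big(m^3 + 3(a+t+1)m^2 + (3t^2 + 6at + 3a^2 + 6a + 6t + 2)m + (3t^2 + 6at + 3a^2 + 3a + 3t)\big) & \text{if } a < 0,
\end{cases}$$

$$C_z = \begin{cases}
\tfrac{1}{6}\big(m^3 - 3(a-1)m^2 + (3a^2 - 6a + 2)m + (3a^2 - 2a - a^3)\big) & \text{if } a \ge 0, \\
\tfrac{1}{6}\big(m^3 - 3(a-1)m^2 + (3a^2 - 6a + 2)m + (3a^2 - 3a)\big) & \text{if } a < 0.
\end{cases}$$

We need $\det(M) < e^{mr} = e^{m(m+1)(m+t+1)}$. In order to optimize the choice of $t$
and $a$, we write $t = \tau m$ and $a = \alpha m$, and observe

$$C_e = C_x = \tfrac{1}{6}(3\tau + 4)\, m^3 + o(m^3),$$

$$C_y = \begin{cases}
\tfrac{1}{6}(3\tau^2 + 6\alpha\tau + 3\alpha^2 + 3\alpha + 3\tau + 1 - \alpha^3)\, m^3 + o(m^3) & \text{if } \alpha \ge 0, \\
\tfrac{1}{6}(3\tau^2 + 6\alpha\tau + 3\alpha^2 + 3\alpha + 3\tau + 1)\, m^3 + o(m^3) & \text{if } \alpha < 0,
\end{cases}$$

$$C_z = \begin{cases}
\tfrac{1}{6}(3\alpha^2 - 3\alpha + 1 - \alpha^3)\, m^3 + o(m^3) & \text{if } \alpha \ge 0, \\
\tfrac{1}{6}(3\alpha^2 - 3\alpha + 1)\, m^3 + o(m^3) & \text{if } \alpha < 0.
\end{cases}$$

Suppose we write $e = N^{\varepsilon}$, $d = N^{\delta}$, and $Y = N^{\beta}$, so $Z = N^{1-\beta}$. Then
$X = N^{\varepsilon + \delta - 1}$. So the requirement on $\det(M)$ now becomes

$$N^{\varepsilon C_e + (\varepsilon + \delta - 1) C_x + \beta C_y + (1 - \beta) C_z} < N^{\varepsilon m(m+1)(m+t+1)} = N^{\varepsilon(\tau + 1) m^3 + o(m^3)}.$$

The above expression holds (for large enough $m$) when, to leading order in $m$,

$$\varepsilon C_e + (\varepsilon + \delta - 1) C_x + \beta C_y + (1 - \beta) C_z - \varepsilon(\tau + 1)\, m^3 < 0. \qquad (5)$$

The left-hand side of this expression achieves its minimum at

$$\tau_0 = (1 - \beta - \delta - 2\alpha_0\beta)/(2\beta),$$

$$\alpha_0 = \begin{cases}
1 - \beta - (1 - \beta - \delta + \beta^2)^{1/2} & \text{if } \beta < \delta, \\
(\beta - \delta)/(2\beta - 2) & \text{if } \beta \ge \delta.
\end{cases}$$

Using $\tau = \tau_0$ and $\alpha = \alpha_0$ will give us the minimum value on the left-hand side of
inequality (5), affording us the largest possible $X$ to give an attack on the largest
possible $d < N^{\delta}$. The entries in Figure 2 were generated by plugging in $\tau_0$ and
$\alpha_0$ and solving for equality in Equation (5).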
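
The procedure just described (plug in the optimal $\tau$ and $\alpha$, then solve for equality) can be sketched numerically. The code below is our own illustration, not the authors' program. It assumes the leading coefficients of $C_e, C_x, C_y, C_z$ and the form of inequality (5) given in this appendix, takes the stationary point obtained by setting the partial derivatives of the left-hand side with respect to $\tau$ and $\alpha$ to zero, and finds the largest $\delta$ by bisection.

```python
import math


def leading_coeffs(tau, alpha):
    """Coefficients of m^3 in C_e, C_x, C_y, C_z for t = tau*m, a = alpha*m."""
    ce = (3 * tau + 4) / 6
    cy = 3 * tau**2 + 6 * alpha * tau + 3 * alpha**2 + 3 * alpha + 3 * tau + 1
    cz = 3 * alpha**2 - 3 * alpha + 1
    if alpha >= 0:                      # the cubic term only appears for alpha >= 0
        cy -= alpha**3
        cz -= alpha**3
    return ce, ce, cy / 6, cz / 6


def lhs(eps, beta, delta):
    """Left-hand side of inequality (5), per m^3, at the optimal tau and alpha."""
    if beta < delta:
        alpha = 1 - beta - math.sqrt(1 - beta - delta + beta**2)
    else:
        alpha = (beta - delta) / (2 * beta - 2)
    tau = (1 - beta - delta - 2 * alpha * beta) / (2 * beta)
    ce, cx, cy, cz = leading_coeffs(tau, alpha)
    return (eps * ce + (eps + delta - 1) * cx
            + beta * cy + (1 - beta) * cz - eps * (tau + 1))


def max_delta(eps, beta, iterations=60):
    """Largest delta with lhs < 0, found by bisection (0 < beta < 1 assumed)."""
    lo, hi = 0.0, min(1.0, 1 - beta + beta**2)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if lhs(eps, beta, mid) < 0:
            lo = mid
        else:
            hi = mid
    return lo


print(round(max_delta(1.0, 0.25), 3))   # Scheme (I) parameters: e ~ N, p ~ N^0.25
print(round(max_delta(1.0, 0.5), 3))    # balanced RSA with e ~ N
```

Under these assumptions the sketch gives a bound close to the $\delta < 0.364$ quoted for Scheme (I), and close to 0.284 for $\varepsilon = 1$, $\beta = 0.5$.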

It is interesting to note that formulation of the root-finding problem for RSA
as a trivariate equation is strictly more powerful than its formulation as the
small inverse problem. This is because the small inverse problem is not expected
to have a unique solution once $\delta > 0.5$, while our attack works in many cases
with $\delta > 0.5$. We note that when $\varepsilon = 1$ and $\beta = 0.5$ (as in standard RSA) our
attack gives identical results to the simpler Boneh-Durfee attack ($d < N^{0.284}$). Their
optimization of using lattices of less than full rank to achieve the $d < N^{0.292}$
bound should also work with our approach, but we have not analyzed how much
of an improvement it will provide.
Abstract. We present an attack on plain ElGamal and plain RSA encryption.
The attack shows that without proper preprocessing of the
plaintexts, both ElGamal and RSA encryption are fundamentally inse- 
cure. Namely, when one uses these systems to encrypt a (short) secret 
key of a symmetric cipher it is often possible to recover the secret key 
from the ciphertext. Our results demonstrate that preprocessing mes- 
sages prior to encryption is an essential part of both systems. 



1 Introduction 

In the literature we often see a description of RSA encryption as $C = M^e \bmod N$
(the public key is $(N, e)$) and a description of ElGamal encryption as
$C = (M y^r, g^r) \bmod p$ (the public key is $(p, g, y)$). Similar descriptions are also given
in the original papers [17,9]. It has been known for many years that this simplified
description of RSA does not satisfy basic security notions, such as semantic
security (see [6] for a survey of attacks). Similarly, a version of ElGamal commonly
used in practice does not satisfy basic security notions (even under the Decision
Diffie-Hellman assumption [5]). To obtain secure systems using RSA and ElGamal
one must apply a preprocessing function to the plaintext prior to encryption,

Footnote (on ElGamal in practice): Implementations of ElGamal often use an element
$g \in \mathbb{Z}_p^*$ of prime order $q$ where $q$ is much smaller than $p$. When the set of
plaintexts is equal to the subgroup generated by $g$, the Decision Diffie-Hellman
assumption implies that ElGamal is semantically secure. Unfortunately,
implementations of ElGamal often encrypt an $m$-bit message by viewing it as an
$m$-bit integer and directly encrypting it. The resulting system is
not semantically secure: the ciphertext leaks the Legendre symbol of the plaintext.

T. Okamoto (Ed.): ASIACRYPT 2000, LNCS 1976, pp. 30-43, 2000.

© Springer-Verlag Berlin Heidelberg 2000
or a conversion to the encryption function (see [10,16,13] for instance). Recent
standards for RSA [15] use Optimal Asymmetric Encryption Padding (OAEP) 
which is known to be secure against a chosen ciphertext attack in the random 
oracle model [4]. Currently, there is no equivalent preprocessing standard for El- 
Gamal encryption, although several proposals exist [1,10,16,13]. Unfortunately, 
many textbook descriptions of RSA and ElGamal do not view these preprocess- 
ing functions as an integral part of the encryption scheme. Instead, common 
descriptions are content with an explanation of the plain systems. 

In this paper we give a simple, yet powerful, attack against both plain RSA 
and plain ElGamal encryption. The attack illustrates that plain RSA and plain 
ElGamal are fundamentally insecure systems. Hence, any description of these 
cryptosystems cannot ignore the preprocessing steps used in full RSA and full 
ElGamal. Our attack clearly demonstrates the importance of preprocessing. It 
can be used to motivate the need for preprocessing in introductory texts. 

Our attack is based on the fact that public key encryption is typically used
to encrypt session-keys. These session-keys are typically short, i.e. less than 128
bits. The attack shows that when using plain RSA or plain ElGamal to encrypt
an $m$-bit key, it is often possible to recover the key in time approximately $2^{m/2}$.
In environments where session-keys are limited to 64-bit keys (e.g. due to
government regulations), our attack shows that both plain RSA and plain ElGamal
result in a completely insecure system. We experimented with the attack and
showed that it works well in practice.



1.1 Summary of Results 

Suppose the plaintext $M$ is $m$ bits long. For illustration purposes, when $m = 64$
we obtain the following results:

- For any RSA public key $(N, e)$, given $C = M^e \bmod N$ it is possible to recover
  $M$ in the time it takes to compute $2 \cdot 2^{m/2}$ modular exponentiations. The
  attack succeeds with probability 18% (the probability is over the choice of
  $M \in \{0, 1, \ldots, 2^m - 1\}$). The algorithm requires $2^{m/2} m$ bits of memory.

- Let $(p, g, y)$ be an ElGamal public key. When the order of $g$ is at most $p/2^m$,
  it is possible to recover $M$ from any ElGamal ciphertext of $M$ in the time
  it takes to compute $2 \cdot 2^{m/2}$ modular exponentiations. The attack succeeds
  with probability 18% (over the choice of $M$), and requires $2^{m/2} m$ bits of
  memory.

- Let $(p, g, y)$ be an ElGamal public key. Suppose $p - 1 = qs$ where $s > 2^m$
  and the discrete log problem for subgroups of $\mathbb{Z}_p^*$ of order $s$ is tractable, i.e.
  takes time $T$ for some small $T$. When the order of $g$ is $p - 1$, it is possible
  to recover $M$ from any ciphertext of $M$ in time $T$ and $2 \cdot 2^{m/2}$ modular
  exponentiations. The attack succeeds with probability 18% (over the choice
  of $M$), and requires $2^{m/2} m$ bits of memory.

- Let $(p, g, y)$ be an ElGamal public key. Suppose again $p - 1 = qs$ where
  $s > 2^m$ and the discrete log problem for subgroups of $\mathbb{Z}_p^*$ of order $s$ takes
  time $T$ for some small $T$. When the order of $g$ is either $p - 1$ or at most $p/2^m$,
  it is possible to recover $M$ from any ciphertext of $M$ in time $T$ plus one
  modular exponentiation and $2 \cdot 2^{m/2}$ additions, provided a precomputation
  step depending only on the public key. The success probability is 18% (over
  the choice of $M$). The precomputations take time $2^{m/2} T$ and $2^{m/2}$ modular
  exponentiations. The space requirement can optionally be decreased to
  the order of $2^{m/4}$ without increasing the computation time, however with a
  loss in the probability of success.

All attacks can be parallelized, and offer a variety of trade-offs, with respect to 
the computation time, the space requirement, and the probability of success. For 
instance, the success probability of 18% can be raised to 35% if the computation 
time is quadrupled. Note that the first result applies to RSA with an arbitrary 
public exponent (small or large). The attack becomes slightly more efficient when 
the public exponent e is small. The second result applies to the usual method 
in which ElGamal is used in practice. The third result applies when ElGamal 
encryption is done in the entire group, however p — 1 has a small smooth factor (a 
64-bit smooth factor) . The fourth result decreases the on-line work of both the 
second and the third results, provided an additional precomputation stage. It 
can optionally improve the time/memory trade-off. The third and fourth results 
assume that p — 1 contains a smooth factor: such a property was used in other 
attacks against discrete-log schemes (see [2,14] for instance). 

1.2 Splitting Probabilities for Integers 

Our attacks can be viewed as a meet-in-the-middle method based on the fact 
that a relatively small integer (e.g., a session-key) can often be expressed as 
a product of much smaller integers. Note that recent attacks on padding RSA 
signature schemes [7] use related ideas. Roughly speaking, these attacks expect 
certain relatively small numbers (such as hashed messages) to be smooth. Here, 
we will be concerned with the size of divisors. Existing analytic results for the 
bounds we need are relatively weak. Hence, we mainly give experimental results 
obtained using the Pari/GP computer package [3]. 

Let $M$ be a uniformly distributed $m$-bit integer. We are interested in the
probability that $M$ can be written as:

- $M = M_1 M_2$ with $M_1 < 2^{m_1}$ and $M_2 < 2^{m_2}$. See Table 1 for some values.

- $M = M_1 M_2 M_3$ with $M_i < 2^{m_i}$. See Table 2 for some values.

- $M = M_1 M_2 M_3 M_4$ with $M_i < 2^{m_i}$. See Table 3 for some values.
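
The two-factor splitting probability can be estimated empirically for small bit-lengths. The sketch below is ours and is not the Pari/GP code used for the tables; it uses trial division, so it is only practical for small $m$, and it assumes strict bounds $M_1 < 2^{m_1}$, $M_2 < 2^{m_2}$ and sampling from $[1, 2^m - 1]$.

```python
import random


def splits(M, m1, m2):
    """True if M = M1 * M2 for some M1 < 2**m1 and M2 < 2**m2."""
    bound1, bound2 = 1 << m1, 1 << m2
    d = 1
    while d * d <= M:
        if M % d == 0:
            e = M // d
            # Try the divisor pair (d, e) in both orders.
            if (d < bound1 and e < bound2) or (e < bound1 and d < bound2):
                return True
        d += 1
    return False


def estimate_split_probability(m, m1, m2, samples=2000, seed=0):
    """Monte Carlo estimate over uniformly random M in [1, 2**m - 1]."""
    rng = random.Random(seed)
    hits = sum(splits(rng.randrange(1, 1 << m), m1, m2) for _ in range(samples))
    return hits / samples


# Small illustrative run; the paper's tables use m = 40 and m = 64.
print(estimate_split_probability(20, 10, 10))
```

For the bit-lengths in Tables 1 to 3 one would replace the trial-division loop with a proper factoring routine and enumerate divisors from the factorization.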

The experimental results given in the tables have been obtained by factoring
a large number of randomly chosen $m$-bit integers with uniform distribution.
Some theoretical results can be obtained from the book [11]. More precisely, for
$1/2 \le \alpha < 1$, let $P_\alpha(m)$ be the probability that a uniformly distributed integer
$M$ in $[1 \ldots 2^m - 1]$ can be written as $M = M_1 M_2$ with both $M_1$ and $M_2$ less than or
equal to $2^{\alpha m}$. It can be shown that $P_{1/2}(m)$ tends (slowly) to zero as $m$ grows to
infinity. This follows (after a little work) from results in [11, Chapter 2] on the
number $H(x, y, z)$ of integers $n < x$ for which there exists a divisor $d$ such that
$y < d < z$. More precisely, the following holds (where log denotes the natural
logarithm):

$$P_{1/2}(m) = O\!\left(\frac{\log\log m \cdot \sqrt{\log m}}{m^{\delta}}\right), \qquad (1)$$

where $\delta = 1 - \frac{1 + \log\log 2}{\log 2} \approx 0.086$. On the other hand, when $\alpha > 1/2$, $P_\alpha(m)$
no longer tends to zero, as one can easily obtain the following asymptotic lower
bound, which corrects [8, Theorem 4, p 377]:

$$\liminf P_\alpha(m) \ge \log(2\alpha). \qquad (2)$$

This is because the probability must include all numbers that are divisible by
a prime in the interval $[2^{m/2}, 2^{\alpha m}]$, and the bound follows from well-known
smoothness probabilities.

Our attacks offer a variety of trade-offs, due to the freedom in the factorization
form, and in the choices of the $m_i$'s: the splitting probability gives the
success probability of the attack, the other parameters determine the cost in
terms of storage and computation time.






Table 1. Experimental probabilities of splitting into two factors.

Bit-length m   m_1   m_2   Probability
40             20    20    18%
40             21    21    32%
40             22    22    39%
40             20    25    50%
64             32    32    18%
64             33    33    29%
64             34    34    35%
64             30    36    40%



Table 2. Experimental probabilities of splitting into three factors.

Bit-length m   m_1 = m_2 = m_3   Probability
64             22                4%
64             23                6.5%
64             24                9%
64             25                12%
Table 3. Experimental probabilities of splitting into four factors.

Bit-length m   m_1 = m_2 = m_3 = m_4   Probability
64             16                      0.5%
64             20                      3%



1.3 Organization of the Paper 

In Section 2 we introduce the subgroup rounding problems which inspire all our
attacks. In Section 3 we present rounding algorithms that break plain ElGamal
encryption when $g$ generates a "small" subgroup of $\mathbb{Z}_p^*$. Using similar ideas, we
present in Section 4 an attack on plain ElGamal encryption when $g$ generates
all of $\mathbb{Z}_p^*$, and an attack on plain RSA in Section 5.

2 The Subgroup Rounding Problems 

Recall that the ElGamal public key system [9] encrypts messages in $\mathbb{Z}_p^*$ for some
prime $p$. Let $g$ be an element of $\mathbb{Z}_p^*$ of order $q$. The private key is a number $x$ in
the range $1 < x < q$. The public key is a tuple $(p, g, y)$ where $y = g^x \bmod p$.
To encrypt a message $M \in \mathbb{Z}_p$ the original scheme works as follows: (1) pick a
random $r$ in the range $1 < r < q$, and (2) compute $u = M \cdot y^r \bmod p$ and
$v = g^r \bmod p$. The resulting ciphertext is the pair $(u, v)$. To speed up the encryption
process one often uses an element $g$ of order much smaller than $p$. For example,
$p$ may be 1024 bits long while $q$ is only 512 bits long.
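
For readers who want to experiment, the scheme just described can be written out in a few lines. The parameters below are a toy choice of ours and far too small for real use: $p = 2^{31} - 1$, $q = 331$ (a prime divisor of $p - 1$), and $g = 7^{(p-1)/q} \bmod p$. The decryption step is the standard ElGamal one and is not spelled out in the text above.

```python
import secrets

P = 2**31 - 1                    # toy prime modulus (assumption, illustration only)
Q = 331                          # prime divisor of P - 1
G = pow(7, (P - 1) // Q, P)      # element whose order divides Q
assert (P - 1) % Q == 0 and G != 1 and pow(G, Q, P) == 1


def keygen():
    """Private key x with 1 <= x < Q and public value y = g^x mod p."""
    x = 1 + secrets.randbelow(Q - 1)
    return x, pow(G, x, P)


def encrypt(M, y):
    """Plain ElGamal: u = M * y^r mod p, v = g^r mod p, with no preprocessing of M."""
    r = 1 + secrets.randbelow(Q - 1)
    return (M * pow(y, r, P)) % P, pow(G, r, P)


def decrypt(u, v, x):
    """Recover M = u * v^(-x) mod p; v^(Q - x) equals v^(-x) since v^Q = 1."""
    return (u * pow(v, Q - x, P)) % P


x, y = keygen()
u, v = encrypt(40200, y)
assert decrypt(u, v, x) == 40200
```

Note that the message is multiplied in as an integer without any encoding, which is exactly the "plain" usage the attacks in this paper exploit.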

For the rest of this section we assume $g \in \mathbb{Z}_p^*$ is an element of order $q$ where
$q \ll p$. For concreteness one may think of $p$ as 1024 bits long and $q$ as 512 bits
long. Let $G_q$ be the subgroup of $\mathbb{Z}_p^*$ generated by $g$. Observe that $G_q$ is extremely
sparse in $\mathbb{Z}_p^*$. Only one in $2^{512}$ elements belongs to $G_q$. We also assume $M$ is a
short message of length much smaller than $\log_2(p/q)$. For example, $M$ is a 64-bit
session-key.

To understand the intuition behind the attack it is beneficial to consider a
slight modification of the ElGamal scheme. After the random $r$ is chosen one
encrypts a message $M$ by computing $u = M + y^r \bmod p$. That is, we "blind"
the message by adding $y^r$ rather than multiplying by it. The ciphertext is then
$(u, v)$ where $v$ is defined as before. Clearly $y^r$ is a random element of $G_q$. We
obtain the following picture:



[Figure: the elements of $G_q$ marked with × along a line, with the value $u$ indicated.]

The × marks represent elements in $G_q$. Since $M$ is a relatively small number,
encryption of $M$ amounts to picking a random element in $G_q$ and then slightly
moving away from it. Assuming the elements of $G_q$ are uniformly distributed in
$\mathbb{Z}_p^*$ the average gap between elements of $G_q$ is much larger than $M$. Hence, with
high probability, there is a unique element $z \in G_q$ that is sufficiently close to
$u$. More precisely, with high probability there will be a unique element $z \in G_q$
satisfying $|u - z| < 2^{64}$. If we could find $z$ given $u$ we could recover $M$. Hence,
we obtain the additive version of the subgroup rounding problem:

Additive subgroup rounding: let $z$ be an element of $G_q$ and $\Delta$ an integer satisfying
$|\Delta| < 2^m$. Given $u = z + \Delta \bmod p$ find $z$. When $m$ is sufficiently small, $z$ is uniquely
determined (with high probability assuming $G_q$ is uniformly distributed in $\mathbb{Z}_p$).

Going back to the original multiplicative ElGamal scheme we obtain the
multiplicative subgroup rounding problem.

Multiplicative subgroup rounding: let $z$ be an element of $G_q$ and $\Delta$ an integer
satisfying $|\Delta| < 2^m$. Given $u = z \cdot \Delta \bmod p$ find $z$. When $m$ is sufficiently small, $z$ is
uniquely determined (with high probability assuming $G_q$ is uniformly distributed
in $\mathbb{Z}_p$).

An efficient solution to either problem would imply that the corresponding
plain ElGamal encryption scheme is insecure. We are interested in solutions
that run in time $O(\sqrt{\Delta})$ or, even better, $O(\log \Delta)$. In the next section we show
a solution to the multiplicative subgroup rounding problem.

The reason we refer to these schemes as “plain ElGamal” is that messages 
are encrypted as is. Our attacks show the danger of using the system in this 
way. For proper security one must pre-process the message prior to encryption 
or modify the encryption mechanism. For example, one could use DHAES [1] or 
a result due to Fujisaki and Okamoto [10], or even more recently [16,13]. 

3 Algorithms for Multiplicative Subgroup Rounding 

We are given an element $u \in \mathbb{Z}_p$ of the form $u = z \cdot \Delta \bmod p$ where $z$ is a random
element of $G_q$ and $|\Delta| < 2^m$. Our goal is to find $\Delta$, which we can assume to be
positive. As usual, we assume that $m$, the length of the message being encrypted,
is much smaller than $\log_2(p/q)$. Then with high probability $\Delta$ is unique. For
example, take $p$ to be 1024 bits long, $q$ to be 512 bits long and $m$ to be 64.
We first give a simple meet-in-the-middle strategy for multiplicative subgroup
rounding. By reduction to a knapsack-like problem, we will then improve both
the on-line computation time and the time/memory trade-off of the method,
provided that $p$ satisfies an additional, yet realistic, assumption.

3.1 A Meet-in-the-Middle Method 

Suppose $\Delta$ can be written as $\Delta = \Delta_1 \cdot \Delta_2$ where $\Delta_1 < 2^{m_1}$ and $\Delta_2 < 2^{m_2}$. For
instance, one can take $m_1 = m_2 = m/2$. We show how to find $\Delta$ from $u$ in space
$O(2^{m_1})$ and $2^{m_1} + 2^{m_2}$ modular exponentiations. Observe that

$$u = z \cdot \Delta = z \cdot \Delta_1 \cdot \Delta_2 \bmod p.$$

Dividing by $\Delta_2$ and raising both sides to the power of $q$ yields:

$$(u/\Delta_2)^q = z^q \cdot \Delta_1^q = \Delta_1^q \bmod p.$$

We can now build a table of size $2^{m_1}$ containing the values $\Delta_1^q \bmod p$ for all
$\Delta_1 = 0, \ldots, 2^{m_1}$. Then for each $\Delta_2 = 0, \ldots, 2^{m_2}$ we check whether $u^q/\Delta_2^q \bmod p$
is present in the table. If so, then $\Delta = \Delta_1 \cdot \Delta_2$ is a candidate value for $\Delta$.
Assuming $\Delta$ is unique, there will be only one such candidate, although there
will probably be several suitable pairs $(\Delta_1, \Delta_2)$.

The algorithm above requires a priori $2^{m_1} + 2^{m_2}$ modular exponentiations and
$2^{m_1} \log_2 p$ bits of memory. However, we do not need to store the complete value
of $\Delta_1^q \bmod p$ in the table: a sufficiently large hash value is enough, as we are only
looking for "collisions". For instance, one can take the $2 \max(m_1, m_2)$ least
significant bits of $\Delta_1^q \bmod p$, so that the space requirement is only $2^{m_1 + 1} \max(m_1, m_2)$
bits instead of $2^{m_1} \log_2 p$. Fewer bits are even possible, for we can check the
validity of the (few) candidates obtained. Note also that the table only depends on $p$
and $q$: the same table can be used for all ciphertexts. For each ciphertext, one
needs to compute at most $2^{m_2}$ modular exponentiations. For each exponentiation,
one has to check whether or not it belongs to the table, which can be done
with $O(m_1)$ comparisons once the table is sorted.
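
The meet-in-the-middle method can be tried out on toy parameters. The following sketch is ours, not the NTL implementation mentioned below; it assumes $p = 2^{31} - 1$, $q = 331$, $m_1 = m_2 = 8$, uses a Python dictionary in place of a sorted table, stores full values rather than hashes, and skips $\Delta_i = 0$ because zero has no inverse.

```python
P = 2**31 - 1                    # toy prime (assumption, illustration only)
Q = 331                          # order of the subgroup G_q; Q divides P - 1
G = pow(7, (P - 1) // Q, P)      # generator of G_q


def build_table(m1):
    """Map Delta_1^q mod p to the list of Delta_1 values producing it."""
    table = {}
    for d1 in range(1, 1 << m1):
        table.setdefault(pow(d1, Q, P), []).append(d1)
    return table


def recover_candidates(u, table, m2):
    """Return every Delta = Delta_1 * Delta_2 with Delta_1^q = (u / Delta_2)^q mod p."""
    uq = pow(u, Q, P)            # equals Delta^q, because z^q = 1 for z in G_q
    found = set()
    for d2 in range(1, 1 << m2):
        inv = pow(pow(d2, Q, P), P - 2, P)       # inverse of Delta_2^q (Fermat)
        for d1 in table.get(uq * inv % P, []):
            found.add(d1 * d2)
    return found


table = build_table(8)           # depends only on (p, q); reusable across ciphertexts
secret = 200 * 201               # a 16-bit "session key" that splits into 8-bit factors
u = secret * pow(G, 123, P) % P  # first ciphertext component u = M * y^r, with y^r in G_q
assert secret in recover_candidates(u, table, 8)
```

A message that does not split into two factors below the chosen bounds is simply not found, which is the source of the success probabilities discussed in Section 1.2.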

It is worth noting that $\Delta_1$ and $\Delta_2$ need not be prime. The probability that a
random $m$-bit integer (such as $\Delta$) can be expressed as a product of two integers,
one being less than $m_1$ bits and the other one being less than $m_2$ bits, is discussed
in Section 1.2.

By choosing different values of $m_1$ and $m_2$ (not necessarily $m/2$), one obtains
various trade-offs with respect to the computation time, the storage requirement,
and the success probability. For instance, when the system is used to encrypt
a 64-bit session key, if we pick $m_1 = m_2 = 32$, the algorithm succeeds with
probability approximately 18% (with respect to the session key), and it requires
on the order of eight billion exponentiations, far less than the time to compute
discrete log in $\mathbb{Z}_p^*$.

We implemented the attack using Victor Shoup’s NTL library [19]. The tim- 
ings should not be considered as optimal, they are meant to give a rough idea of 
the attack efficiency, compared to exhaustive search attacks on the symmetric al- 
gorithm. Running times are given for a single 500 MHz 64-bit DEC Alpha/Linux. 
If $m = 40$ and $m_1 = m_2 = 20$, and we use a 160-bit $q$ and a 512-bit $p$, the
precomputation step takes 40 minutes, and each message is recovered in less than 1
hour and 30 minutes. From Section 1.2, it also means that, given only the public 
key and the ciphertext, a 40-bit message can be recovered in less than 6 hours 
on a single workstation, with probability 39%. 



3.2 Reduction to Knapsack-like Problems 

We now show how to improve the on-line computation time ($2^{m/2}$ modular
exponentiations) and the time/memory trade-off of the method. We transform the
multiplicative rounding problem into a linear problem, provided that $p$ satisfies
the additional assumption $p - 1 = qrs$ where $s > 2^m$ is such that discrete logs in
subgroups of $\mathbb{Z}_p^*$ of order $s$ can be efficiently computed. For instance, if $s = \prod_i p_i^{e_i}$
is the prime factorization of $s$, discrete logs in a cyclic group of order $s$ can be
computed with $O(\sum_i e_i(\log s + \sqrt{p_i}))$ group operations and negligible space,
using Pohlig-Hellman and Pollard's $\rho$ methods (see [12]). Let $\omega$ be a generator
of $\mathbb{Z}_p^*$. For all $x \in \mathbb{Z}_p^*$, $x^{qr}$ belongs to the subgroup $G_s$ of order $s$ generated by
$\omega^{qr}$.

The linear problem that we will consider is known as the $k$-table problem:
given $k$ tables $T_1, \ldots, T_k$ of integers and a target integer $n$, the $k$-table problem
is to return all expressions (possibly zero) of $n$ of the form $n = t_1 + t_2 + \cdots + t_k$
where $t_i \in T_i$. The general $k$-table problem has been studied by Schroeppel and
Shamir [18], because several NP-complete problems (e.g., the knapsack problem)
can be reduced to it. We will apply (slightly modified) known solutions to the
$k$-table problems, for $k = 2$, 3 and 4.



The Modular 2-Table Problem. Suppose that $\Delta$ can be written as $\Delta = \Delta_1 \cdot \Delta_2$,
with $0 < \Delta_1 < 2^{m_1}$ and $0 < \Delta_2 < 2^{m_2}$, as in Section 3.1. We have
$u^q = \Delta_1^q \Delta_2^q \bmod p$ and therefore:

$$u^{qr} = \Delta_1^{qr} \Delta_2^{qr} \bmod p,$$

which can be rewritten as

$$\log(u^{qr}) = \log(\Delta_1^{qr}) + \log(\Delta_2^{qr}) \bmod s,$$

where the logarithms are with respect to $\omega^{qr}$.

We build a table $T_1$ consisting of $\log(\Delta_1^{qr})$ for all $\Delta_1 = 0, \ldots, 2^{m_1}$, and a table
$T_2$ consisting of $\log(\Delta_2^{qr})$ for all $\Delta_2 = 0, \ldots, 2^{m_2}$. These tables are independent
of $\Delta$. The problem is now to express $\log(u^{qr})$ as a modular sum $t_1 + t_2$, where
$t_1 \in T_1$ and $t_2 \in T_2$. The number of targets $t_1 + t_2$ is $2^{m_1 + m_2}$. Hence, we
expect this problem to have very few solutions when $s \gg 2^{m_1 + m_2}$. The problem
involves modular sums, but it can of course be viewed as a 2-table problem with
two targets $\log(u^{qr})$ and $\log(u^{qr}) + s$. The classical method to solve the 2-table
problem with a target $n$ is the following:

1. Sort $T_1$ in increasing order;

2. Sort $T_2$ in decreasing order;

3. Repeat until either $T_1$ or $T_2$ becomes empty (in which case all solutions have
   already been found):

   (a) Compute $t = \mathrm{first}(T_1) + \mathrm{first}(T_2)$.

   (b) If $t = n$, output the solution which has been found, and delete $\mathrm{first}(T_1)$
       from $T_1$, and $\mathrm{first}(T_2)$ from $T_2$;

   (c) If $t < n$ delete $\mathrm{first}(T_1)$ from $T_1$;

   (d) If $t > n$ delete $\mathrm{first}(T_2)$ from $T_2$;



It is easy to see that the method outputs all solutions of the 2-table problem, in
time $2^{\min(m_1,m_2)+1}$. The space requirement is $O(2^{m_1} + 2^{m_2})$.

Since the original problem involves modular sums, it seems at first glance
that we have to apply the previous algorithm twice (with two different targets).
However, a simple modification of the previous algorithm can in fact
solve the modular 2-table problem (that is, the 2-table problem with modular
additions instead of integer additions). The basic idea is the following. Since
$T_2$ is sorted in descending order, $n - T_2$ is sorted in ascending order. The set
$(n - T_2) \bmod s$, though not necessarily sorted, is almost sorted. More precisely,
two adjacent numbers are always in the right order, with the exception of a single
pair. This is because $n - T_2$ is contained in an interval of length $s$. The single
pair of adjacent numbers in reverse order corresponds to the two elements $a$ and
$b$ of $T_2$ surrounding $s - n$. These two elements can easily be found by a simple
dichotomy search for $s - n$ in $T_2$. And once the elements are known, we can
access $(n - T_2) \bmod s$ in ascending order by viewing $T_2$ as a circular list,
starting our enumeration of $T_2$ by $b$, and stopping at $a$.
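A minimal sketch of this circular-list idea follows. The function name `modular_two_table` is hypothetical, and the sketch assumes that the table entries are distinct and that the entries and the target $n$ are already reduced to the range $[0, s)$. Under these conventions $(n - t_2) \bmod s$ wraps exactly when $t_2$ exceeds $n$, so the sketch locates the starting point by a binary search for $n$ itself; this choice of pivot is an assumption of the sketch.

```python
from bisect import bisect_right


def modular_two_table(T1, T2, n, s):
    """Return all (t1, t2) with (t1 + t2) % s == n, using a single merge pass."""
    a = sorted(T1)                          # T1 in increasing order
    asc = sorted(T2)
    b = asc[::-1]                           # T2 in decreasing order
    k = len(b) - bisect_right(asc, n)       # number of entries of T2 greater than n
    rotated = b[k:] + b[:k]                 # circular enumeration of T2
    # (n - t2) % s is now strictly increasing along `rotated`.
    out = []
    i = j = 0
    while i < len(a) and j < len(rotated):
        c = (n - rotated[j]) % s
        if a[i] == c:
            out.append((a[i], rotated[j]))
            i += 1
            j += 1
        elif a[i] < c:
            i += 1
        else:
            j += 1
    return out


print(modular_two_table([3, 10, 16], [2, 6, 12], 5, 17))
```

The rotation only changes where the enumeration of $T_2$ starts, so the cost stays that of one merge pass rather than two runs with two targets.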

The total cost of the method is the following. The precomputation of tables
$T_1$ and $T_2$ requires $2^{m_1} + 2^{m_2}$ modular exponentiations and discrete log
computations in a subgroup of $\mathbb{Z}_p^*$ of order $s$, and the sort of $T_1$ and $T_2$. The
space requirement is $(2^{m_1} + 2^{m_2}) \log_2 s$ bits. For each ciphertext, we require one
modular exponentiation, one efficient discrete log (to compute the target), and
$2^{\min(m_1,m_2)+1}$ additions. Hence, we improved the on-line work of the method of
Section 3.1: loosely speaking, we replaced modular exponentiations by simple
additions. We now show how to decrease the space requirement of the method.

The Modular 3-Table Problem The previous approach can easily be extended
to an arbitrary number of factors of $\Delta$. Suppose for instance $\Delta$ can be
written as $\Delta = \Delta_1 \cdot \Delta_2 \cdot \Delta_3$ where each $\Delta_i$ is less than $2^{m_i}$. We obtain

$$\log(u^{qr}) = \sum_{i=1}^{3} \log(\Delta_i^{qr}) \bmod s,$$

where the logarithms are with respect to $\omega^{qr}$. In a precomputation step, we
compute in a table $T_i$ all the logarithms of $\Delta_i^{qr} \bmod p$ for $0 \le \Delta_i < 2^{m_i}$. We are
left with a modular 3-table problem with target $\log(u^{qr})$. The modular 3-table
problem with target $n$ modulo $s$ can easily be solved in time $O(2^{m_1 + \min(m_2, m_3)})$
and space $O(2^{m_2} + 2^{m_3})$. It suffices to apply the modular 2-table algorithm
on tables $T_2$ and $T_3$, for all targets $(n - t_1) \bmod s$, with $t_1 \in T_1$.

Hence, we decreased the space requirement of the method of Section 3.2, by
(slightly) increasing the on-line computation work and decreasing the success
probability (see Section 1.2 for the probability of splitting into three factors).
More precisely, if $m_1 = m_2 = m_3 = m/3$, the on-line work is one modular
exponentiation, one discrete log in a group of order $s$, and $2^{2m/3}$ additions. Since
an addition is very cheap, this might be useful for practical purposes.
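The reduction from three tables to two can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, table entries are distinct and lie in $[0, s)$, and for brevity the inner modular 2-table step is carried out with the two integer targets $n'$ and $n' + s$ rather than with the circular-list refinement.

```python
def _two_pointer(asc, desc, target):
    """Classical 2-table scan on an ascending and a descending list."""
    i = j = 0
    while i < len(asc) and j < len(desc):
        t = asc[i] + desc[j]
        if t == target:
            yield asc[i], desc[j]
            i += 1
            j += 1
        elif t < target:
            i += 1
        else:
            j += 1


def modular_three_table(T1, T2, T3, n, s):
    """Return all (t1, t2, t3) with (t1 + t2 + t3) % s == n."""
    b = sorted(T2)                    # sorted once, reused for every t1
    c = sorted(T3, reverse=True)
    out = []
    for t1 in T1:
        target = (n - t1) % s
        # t2 + t3 lies in [0, 2s - 2], so only these two integer targets can occur.
        for tgt in (target, target + s):
            for t2, t3 in _two_pointer(b, c, tgt):
                out.append((t1, t2, t3))
    return out


print(modular_three_table([1, 4], [0, 7], [3, 9], 2, 11))
```

Only $T_2$ and $T_3$ need to be kept sorted in memory; $T_1$ is merely iterated over, which is where the space saving comes from.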

The Modular 4-Table Problem Using 3 factors did not improve the time/memory
trade-off of the on-line computation work. Indeed, for both modular 2-table
and modular 3-table problems, our algorithms satisfy $TS = O(2^m)$, where
$T$ is the number of additions, and $S$ is the space requirement. Surprisingly, one
can obtain a better time/memory trade-off with 4 factors.

Suppose $\Delta$ can be written as $\Delta = \Delta_1 \cdot \Delta_2 \cdot \Delta_3 \cdot \Delta_4$ where each $\Delta_i$ is less than
$2^{m_i}$. For instance, one can take $m_1 = m_2 = m_3 = m_4 = m/4$. We show how to
find $\Delta$ from $\log(u^{qr})$ in time $O(2^{m_1+m_2} + 2^{m_3+m_4})$ and space $O(\sum_{i=1}^{4} 2^{m_i})$,
provided a precomputation stage of $\sum_{i=1}^{4} 2^{m_i}$ modular exponentiations and discrete
log computations in a group of order $s$.

We have $\log(u^{qr}) = \sum_{i=1}^{4} \log(\Delta_i^{qr}) \bmod s$. Again, in a precomputation step,
we compute in a table $T_i$ all the logarithms of $\Delta_i^{qr} \bmod p$ for $0 \le \Delta_i < 2^{m_i}$.
We are left with a modular 4-table problem, whose solutions will reveal possible
choices of $\Delta_1$, $\Delta_2$, $\Delta_3$ and $\Delta_4$. Schroeppel and Shamir [18] proposed a clever
solution to the basic 4-table problem, using the following idea. An obvious solution
to the 4-table problem is to solve a 2-table problem by merging two tables,
that is, considering sums $t_1 + t_2$ and $t_3 + t_4$ separately. However, the
2-table algorithm described in Section 3.2 accesses the elements of the
sorted supertables sequentially, and thus there is no need to store all the possible
combinations simultaneously in memory. All we need is the ability to generate
them quickly (on-line, upon request) in sorted order. To implement this idea,
two priority queues are used:

- $Q'$ stores pairs $(t_1, t_2)$ from $T_1 \times T_2$, enables arbitrary insertions and deletions
to be done in logarithmic time, and makes the pair with the smallest $t_1 + t_2$
sum accessible in constant time.

- $Q''$ stores pairs $(t_3, t_4)$ from $T_3 \times T_4$, enables arbitrary insertions and deletions
to be done in logarithmic time, and makes the pair with the largest $t_3 + t_4$
sum accessible in constant time.

This leads to the following algorithm for a target $n$:

1. Precomputation:

- Sort $T_2$ into increasing order, and $T_4$ into decreasing order;

- Insert into $Q'$ all the pairs $(t_1, \mathrm{first}(T_2))$ for $t_1 \in T_1$;

- Insert into $Q''$ all the pairs $(t_3, \mathrm{first}(T_4))$ for $t_3 \in T_3$.

2. Repeat until either $Q'$ or $Q''$ becomes empty (in which case all solutions
have been found):

- Let $(t_1, t_2)$ be the pair with smallest $t_1 + t_2$ in $Q'$;

- Let $(t_3, t_4)$ be the pair with largest $t_3 + t_4$ in $Q''$;

- Compute $t = t_1 + t_2 + t_3 + t_4$.

- If $t = n$, we output the solution, and apply what is planned when $t < n$
or $t > n$.

- If $t < n$ do

• delete $(t_1, t_2)$ from $Q'$;

• if the successor $t_2'$ of $t_2$ in $T_2$ is defined, insert $(t_1, t_2')$ into $Q'$;

- If $t > n$ do

• delete $(t_3, t_4)$ from $Q''$;

• if the successor $t_4'$ of $t_4$ in $T_4$ is defined, insert $(t_3, t_4')$ into $Q''$;

At each stage, a $t_1 \in T_1$ can participate in at most one pair in $Q'$, and a $t_3 \in T_3$
can participate in at most one pair in $Q''$. It follows that the space complexity of
the priority queues is bounded by $O(|T_1| + |T_3|) = O(2^{m_1} + 2^{m_3})$. Each possible
pair can be deleted from $Q'$ at most once, and the same holds for $Q''$. Since
at each iteration, one pair is deleted from $Q'$ or $Q''$, the number of iterations
cannot exceed the number of possible pairs, which is $2^{m_1+m_2} + 2^{m_3+m_4}$.
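The two-queue algorithm can be sketched with Python's `heapq` module, for integer (non-modular) sums. The function names are hypothetical and table entries are assumed distinct. One detail is an addition of this sketch rather than something stated above: when several pairs share the current smallest (or largest) sum, they are removed from the queue together, so that every combination is reported when $t = n$.

```python
import heapq


def four_table(T1, T2, T3, T4, n):
    """Return all (t1, t2, t3, t4), one entry per table, summing to n."""
    if not (T1 and T2 and T3 and T4):
        return []
    lo = sorted(T2)                    # T2 in increasing order
    hi = sorted(T4, reverse=True)      # T4 in decreasing order
    # Heap entries are (key, t, index of the partner in the sorted table).
    q1 = [(t1 + lo[0], t1, 0) for t1 in T1]        # Q': smallest t1 + t2 on top
    q2 = [(-(t3 + hi[0]), t3, 0) for t3 in T3]     # Q'': largest t3 + t4 on top
    heapq.heapify(q1)
    heapq.heapify(q2)

    def pop_group(q, table, sign):
        """Pop every pair sharing the top key and push each pair's successor."""
        key = q[0][0]
        group = []
        while q and q[0][0] == key:
            _, t, idx = heapq.heappop(q)
            group.append((t, table[idx]))
            if idx + 1 < len(table):
                heapq.heappush(q, (sign * (t + table[idx + 1]), t, idx + 1))
        return group

    out = []
    while q1 and q2:
        total = q1[0][0] - q2[0][0]    # (t1 + t2) + (t3 + t4)
        if total < n:
            pop_group(q1, lo, 1)
        elif total > n:
            pop_group(q2, hi, -1)
        else:
            left = pop_group(q1, lo, 1)
            right = pop_group(q2, hi, -1)
            out.extend((a, b, c, d) for a, b in left for c, d in right)
    return out


print(four_table([1, 2], [10, 20], [100, 200], [1000, 2000], 1212))
```

Each queue never holds more than one pair per element of $T_1$ (respectively $T_3$), which matches the space bound discussed above.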

Finally, as in the 2-table case, we note that this algorithm can be adapted to
modular sums, by changing the starting points in $T_2$ and $T_4$ to make sure that
the modular sets are enumerated in the correct order. Hence, it is not necessary
to apply the 4-table algorithm on 4 targets. If $m_1 = m_2 = m_3 = m_4 = m/4$, we
obtain a time complexity of $O(2^{m/2})$ and a space complexity of only $O(2^{m/4})$,
which improves the time/memory trade-off of the 2-table and 3-table methods above.
The probability that a random m-bit integer (such as $\Delta$) can be expressed as a
product of four integers $\Delta_i$, where $\Delta_i$ has fewer than $m_i$ bits, is given in Section 1.2.
Different values of $m_1$, $m_2$, $m_3$ and $m_4$ (not necessarily $m/4$) give rise to different
trade-offs with respect to the computation time, the storage requirement, and
the success probability.

Our experiments show that, as expected, the method requires much less
computing power than a brute-force attack on the 64-bit key using the symmetric
encryption algorithm. We implemented the attack on a 400 MHz PII running Linux. Here
is a numerical example, using DSS-like parameters:

q = 762503714763387752235260732711386742425586145191 

p = 124452971950208973279611466845692849852574447655208586550576344180427926821830 
38633894759924784265833354926964504544903320941144896341512703447024972887681 

The 160-bit number $q$ divides the 512-bit number $p - 1$. The smooth part of
$p - 1$ is $4783 \cdot 1759 \cdot 1627 \cdot 139 \cdot 113 \cdot 41 \cdot 11 \cdot 7 \cdot 5 \cdot 2^7$, which is a 69-bit number.
Our attack recovered the 64-bit secret message 14327865741237781950 in only two
and a half hours (we were lucky, as the maximal running time for 64 bits should
be around 14 hours).
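The sizes quoted in this example can be rechecked with a few lines of Python. The digits below are transcribed from the text above (the two printed lines of $p$ are taken to be one number), so the divisibility checks are printed rather than assumed; a transcription or extraction error would make them report False.

```python
q = 762503714763387752235260732711386742425586145191
p = int(
    "124452971950208973279611466845692849852574447655208586550576344180427926821830"
    "38633894759924784265833354926964504544903320941144896341512703447024972887681"
)
message = 14327865741237781950

# Exponent of 2 in p - 1 (2-adic valuation).
two_power = 0
while (p - 1) % 2 ** (two_power + 1) == 0:
    two_power += 1

# Smooth part: the listed odd primes times the power of 2 found above.
smooth = 2 ** two_power
for f in (4783, 1759, 1627, 139, 113, 41, 11, 7, 5):
    smooth *= f

print("bits of q and p:", q.bit_length(), p.bit_length())
print("exponent of 2 in p - 1:", two_power)
print("bits of the smooth part:", smooth.bit_length())
print("bits of the message:", message.bit_length())
print("q divides p - 1:", (p - 1) % q == 0)
print("smooth part divides p - 1:", (p - 1) % smooth == 0)
```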

4 An Attack on ElGamal Using a Generator of $\mathbb{Z}_p^*$

So far, our attacks on ElGamal encryption apply when the public key $(p, g, y)$
uses an element $g \in \mathbb{Z}_p^*$ whose order is much less than $p$. Although many
implementations of ElGamal use such $g$, it is worth studying whether a
“meet-in-the-middle attack” is possible when $g$ generates all of $\mathbb{Z}_p^*$. We show that the answer is
positive, although we cannot directly use the algorithm for subgroup rounding.

Let $(p, g, y)$ be an ElGamal public key where $g$ generates all of $\mathbb{Z}_p^*$. Suppose
an m-bit message $M$ is encrypted using plain ElGamal, i.e. the ciphertext is
$(u, v)$ where $u = M \cdot y^r$ and $v = g^r$. Suppose $s$ is a factor of $p - 1$ so that in
the subgroup of $\mathbb{Z}_p^*$ of order $s$ the discrete log problem is not too difficult (as
in Section 3.2), i.e. takes time $T$ for some small $T$. For example, $s$ may be an
integer with only small prime divisors (a smooth integer).

We show that when $s > 2^m$ it is often possible to recover the plaintext from
the ciphertext in time $2^{m/2} m$ plus the time it takes to compute one discrete log
in the subgroup of $\mathbb{Z}_p^*$ of order $s$. We refer to this subgroup as $G_s$. Note that
when $M$ is a 64-bit session key the only constraint on $p$ is that $p - 1$ have a 64-bit
smooth factor.

Let $u = M \cdot y^r$ and $v = g^r$ be an ElGamal ciphertext. As before, suppose
$M = M_1 \cdot M_2$ where both $M_1$ and $M_2$ are less than $2^{m/2}$. Let $q = (p-1)/s$; then
$M_1 y^r = u/M_2 \bmod p$. Hence,

$$M_1^q (y^r)^q = u^q / M_2^q \bmod p.$$

We cannot use the technique of Section 3.1 directly since we do not know the
value of $y^{rq}$. Fortunately, $y^{rq}$ is contained in $G_s$. Hence, we can compute $y^{rq}$
directly using the public key $y$ and $v = g^r$. Indeed, suppose we had an integer
$\alpha$ such that $y^q = (g^q)^\alpha$. Then $y^{rq} = g^{rq\alpha} = v^{q\alpha}$. Computing $\alpha$ amounts to
computing a single discrete log in $G_s$. Once $\alpha$ is found the problem is reduced
to finding $(M_1, M_2)$ satisfying:

$$M_1^q \cdot v^{q\alpha} = u^q / M_2^q \bmod p \qquad (3)$$

The techniques of Section 3.1 can now be used to find all such $(M_1, M_2)$ in the
time it takes to compute $2^{m/2}$ exponentiations. Since the subgroup $G_s$ contains
at least $2^m$ elements the number of solutions is bounded by $m$. The correct
solution can then be easily found by other means, e.g. by trying all $m$ candidate
plaintexts until one of them succeeds as a “session-key”.
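The attack can be tried on toy parameters. Everything numeric below is an assumption chosen for illustration: the prime $p = 12289 = 3 \cdot 2^{12} + 1$, the smooth factor $s = 2^{12}$, the private exponent, the randomness $r$ and the 10-bit message $M = 21 \cdot 29$. The function names are hypothetical, and the discrete log in $G_s$ is done by a lookup table, which is only reasonable at this toy size (the text relies on Pohlig-Hellman style methods instead).

```python
def find_generator(p, prime_factors):
    """Smallest generator of Z_p^*, given the prime factors of p - 1."""
    for g in range(2, p):
        if all(pow(g, (p - 1) // f, p) != 1 for f in prime_factors):
            return g


def recover(p, s, g, y, u, v, half_bits):
    """Candidate plaintexts M = M1 * M2 with M1, M2 < 2**half_bits."""
    q = (p - 1) // s
    # alpha = discrete log of y^q in base g^q, inside the subgroup of order s.
    base, acc, logs = pow(g, q, p), 1, {}
    for e in range(s):
        logs[acc] = e
        acc = acc * base % p
    alpha = logs[pow(y, q, p)]
    mask = pow(v, q * alpha, p)            # equals y^(r*q), computed without r
    bound = 1 << half_bits
    table = {}
    for m1 in range(1, bound):             # left side of equation (3)
        table.setdefault(pow(m1, q, p) * mask % p, []).append(m1)
    uq = pow(u, q, p)
    candidates = set()
    for m2 in range(1, bound):             # right side of equation (3)
        rhs = uq * pow(pow(m2, q, p), p - 2, p) % p   # division via Fermat inverse
        for m1 in table.get(rhs, []):
            candidates.add(m1 * m2)
    return candidates


p, s = 12289, 4096                         # p - 1 = 3 * 2**12
g = find_generator(p, [2, 3])
x, r, M = 1234, 777, 21 * 29               # private key, randomness, message
y = pow(g, x, p)
u, v = M * pow(y, r, p) % p, pow(g, r, p)
candidates = recover(p, s, g, y, u, v, half_bits=5)
print(M in candidates, len(candidates))
```

The candidate set may contain a few values besides $M$, since raising to the power $q$ loses information; as in the text, the right one would be singled out by other means.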

Note that all the techniques of Section 3.2 can also be applied. The on-line
work of $2^{m/2}$ modular exponentiations is then decreased to $2^{m/2}$ additions,
provided the precomputation of many discrete logs in $G_s$. Indeed, by taking
logarithms in (3), one is left with a modular 2-table problem. Splitting the unknown
message $M$ in a different number of factors leads to other modular k-table
problems. One can thus obtain various trade-offs with respect to the computation
time, the memory space, and the probability of success, as described in
Section 3.2.

To summarize, when $g$ generates all of $\mathbb{Z}_p^*$ the meet-in-the-middle attack can
often be used to decrypt ElGamal ciphertexts in time $2^{m/2}$ as long as $p - 1$
contains an m-bit smooth factor.

5 A Meet-in-the-Middle Attack on Plain RSA

To conclude we remark that the same technique used for the subgroup rounding
problem can be used to attack plain RSA. This was also mentioned in [8]. In
its simplest form, the RSA system [17] encrypts messages in $\mathbb{Z}_N$ where $N = pq$
for some large primes $p$ and $q$. The public key is $(N, e)$ and the private key is
$d$, where $e \cdot d = 1 \bmod \phi(N)$ with $\phi(N) = (p - 1)(q - 1)$. A message $M \in \mathbb{Z}_N$
is then encrypted into $c = M^e \bmod N$. To speed up the encryption process one
often uses a public exponent $e$ much smaller than $N$, such as $e = 2^{16} + 1$.

Suppose the m-bit message $M$ can be written as $M = M_1 M_2$ with $M_1 \le 2^{m_1}$
and $M_2 \le 2^{m_2}$. Then:

$$\frac{c}{M_2^e} = M_1^e \bmod N.$$

We can now build a table of size $2^{m_1}$ containing the values $M_1^e \bmod N$ for all
$M_1 = 0, \ldots, 2^{m_1}$. Then for each $M_2 = 0, \ldots, 2^{m_2}$, we check whether $c/M_2^e \bmod N$
is present in the table. Any collision will reveal the message $M$. As in
Section 3.1, we note that storing the complete value of $M_1^e \bmod N$ is not necessary:
for instance, storing the $2\max(m_1, m_2)$ least significant bits should be enough.
The attack thus requires $2^{m_1+1} \max(m_1, m_2)$ bits of memory and takes $2^{m_2}$ modular
exponentiations (we can assume that the table sort is negligible, compared
to exponentiations).
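A minimal sketch of this table lookup is given below. The function name `rsa_mitm` and all numbers are illustrative assumptions: the modulus is the product of the two Mersenne primes $2^{31} - 1$ and $2^{61} - 1$, far too small to be a real key, and the sketch stores full values in a dictionary instead of truncated values in a sorted table. It skips $M_1 = 0$ and $M_2 = 0$, which have no use in the division.

```python
def rsa_mitm(c, e, N, m1, m2):
    """Return candidate messages M1 * M2 with M1 <= 2**m1 and M2 <= 2**m2."""
    # Table of M1^e mod N for every candidate M1.
    table = {pow(a, e, N): a for a in range(1, (1 << m1) + 1)}
    found = []
    for b in range(1, (1 << m2) + 1):
        # c / M2^e mod N; pow(x, -1, N) is the modular inverse (Python 3.8+).
        val = c * pow(pow(b, e, N), -1, N) % N
        a = table.get(val)
        if a is not None:
            found.append(a * b)
    return found


N = (2**31 - 1) * (2**61 - 1)      # toy modulus, for illustration only
e = 2**16 + 1
M = 1009 * 1013                    # a message that splits into two 10-bit factors
c = pow(M, e, N)
print(M in rsa_mitm(c, e, N, 10, 10))
```

Note that the attack never uses the factorisation of $N$ or the private exponent; it works from the public key and the ciphertext alone.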

Using a non-optimized implementation (based on the NTL [19] library), we
obtained the following results. The timings give a rough idea of the attack
efficiency, compared to exhaustive search attacks on the symmetric algorithm.
Running times are given for a single 500 MHz 64-bit DEC Alpha running Linux. If $m = 40$
and $m_1 = m_2 = 20$, and we use a public exponent $2^{16} + 1$ with a 512-bit modulus,
the precomputation step takes 3 minutes, and each message is recovered in less
than 10 minutes. From Section 1.2, it also means that, given only the public key
and the ciphertext, a 40-bit message can be recovered in less than 40 minutes
on a single workstation, with probability at least 39%.



6 Summary and Open Problems 

We showed that plain RSA and plain ElGamal encryption are fundamentally
insecure. In particular, when they are used to encrypt an m-bit session-key, the
key can often be recovered in time approximately $2^{m/2}$. Hence, although an
m-bit key is used, the effective security provided by the system is only $m/2$
bits. These results demonstrate the importance of adding a preprocessing step
such as OAEP to RSA and a process such as DHAES to ElGamal. The attack
presented in the paper can be used to motivate the need for preprocessing in
introductory descriptions of these systems.

There are a number of open problems regarding this attack: 

Problem 1: Is there an $O(2^{m/2})$ time algorithm for the multiplicative subgroup
rounding problem that works for all $\Delta$?

Problem 2: Is there an $O(2^{m/2})$ time algorithm for the additive subgroup
rounding problem?

Problem 3: Can either the multiplicative or additive problems be solved in
time less than $\Omega(2^{m/2})$? Is there a sub-exponential algorithm (in $2^m$)?

Abstract. In 1985 Fell and Diffie proposed constructing trapdoor functions
with multivariate equations [11]. They used several sequentially
solved stages that combine into a triangular system we call T. In the
present paper, we study a more general family of TPM (for “Triangle
Plus Minus”) schemes: a triangular construction mixed with some $u$ random
polynomials and with some $r$ of the beginning equations removed.
We go beyond all previous attacks proposed on such cryptosystems using
a low degree component of the inverse function. The cryptanalysis of
TPM is reduced to a simple linear algebra problem called MinRank(r):
Find a linear combination of given matrices that has a small rank $r$.

We introduce a new attack for MinRank called ‘Kernel Attack’ that
works for $q^r$ small. We explain that TPM schemes can be used in
encryption only if $q^r$ is small and therefore they are not secure.

As an application, we showed that the TTM cryptosystem proposed by
T.T. Moh at CrypTec’99 [15,16] reduces to MinRank(2). Thus, though
the cleartext size is 512 bits, we break it in $O(2^{52})$. The particular TTM
of [15,16] can be broken in $O(2^{28})$ due to additional weaknesses, and we
needed only a few minutes to solve the challenge TTM 2.1 from the website
of the TTM selling company, US Data Security.

We also studied TPM in signature, possible only if $q^u$ is small. It is equally
insecure: the ‘Degeneracy Attack’ we introduce runs in $q^u \cdot$ polynomial.



1 Introduction 

The current research effort in practical public key cryptography introduced by
Rivest, Shamir and Adleman, with univariate polynomials over $\mathbb{Z}_N$, is following
two paths. The first is considering more complex groups, e.g. elliptic curves. The
second is considering multivariate equations. Though many proposed schemes
are being broken, some remain unbroken even for the simplest groups like $\mathbb{Z}_2$.
One of the paradigms for constructing multivariate trapdoor cryptosystems 
is the triangular construction, proposed initially in an iterated form by Fell and 
Diffie (1985). It uses equations that involve 1, 2, . . . , n variables and are solved 
sequentially. The special form of the equations is hidden by two linear transfor- 
mations on inputs (variables) and outputs (equations). We call T this triangular 
construction. Let TPM (T Plus-Minus) be T with added final u random (full- 
size) quadratic polynomials, and with r of the beginning equations removed. 

T. Okamoto (Ed.): ASIACRYPT 2000, LNCS 1976, pp. 44-57, 2000.

© Springer-Verlag Berlin Heidelberg 2000

The cryptosystem TTM, proposed by T.T. Moh at CrypTec’99, is, in spite
of its apparent complexity, shown in Section 2.4 to be a subcase of TPM. The initially
proposed scheme is very weak due to linear dependencies, and in Section 4.2
we present the solution (plaintext) to the TTM 2.1 challenge proposed by the
company US Data Security, which is currently selling implementations of TTM.
After this, we focus on breaking more general TPM schemes.

The general strategy to recover the secret key of TPM/TTM systems is
presented in Section 3. It requires finding a linear combination of public equations that
depends only on a subspace of variables. This gives a simple linear algebra problem
called MinRank: Let us consider some $n \times n$ matrices over GF($q$): $M_1, \ldots, M_m$.
We need to find a linear combination $M$ of the $M_i$ that has a small rank $r < n$.
The name MinRank has apparently been used first in the paper [19] that shows
that MinRank is NP-complete. However the MinRank instances in TPM/TTM
use very small $r$, e.g. T.T. Moh’s proposal from [16] gives $r = 2$. We note
that the powerful idea of using a small rank goes back to the cryptanalysis of
Shamir’s birational scheme [20] by Coppersmith, Stern and Vaudenay [6,7], and
appears also in the Shamir-Kipnis attack [14] on the HFE scheme proposed by Patarin [17].

In Section 2.2 we explain how to use the TPM schemes in encryption, which is possible
only if $q^r$ is small. However, in Section 5 we present an attack that works
precisely when $q^r$ is small, based on the small co-dimension of the kernel of
the unknown matrix $M$. This ‘Kernel attack’ breaks in approximately $2^{52}$ a
cryptosystem with 512-bit cleartexts.

Similarly, in Section 2.3 we explain how to use the TPM schemes in signature,
possible only with $q^u$ not too big. Then in Section 6 we introduce the ‘Degeneracy
attack’ on TPM based on iterative searching of degenerate polynomials. It works
precisely when $q^u$ is small, and the signature proposals of [15,16] are insecure.

2 The TPM Family of Cryptosystems 

2.1 General Description of TPM 

In the present section, we describe the general family TPM($n, u, r, K$), with:

- $n$, $u$, $r$ integers such that $r < n$. We also systematically put $m = n + u - r$.

- $K = \mathrm{GF}(q)$ a finite field.

We first consider a function $\Psi : K^n \to K^{n+u-r}$ such that $(y_1, \ldots, y_{n+u-r}) = \Psi(x_1, \ldots, x_n)$
is defined by the following system of equations:

$$\begin{cases}
y_1 = x_1 + g_1(x_{n-r+1}, \ldots, x_n) \\
y_2 = x_2 + g_2(x_1; x_{n-r+1}, \ldots, x_n) \\
y_3 = x_3 + g_3(x_1, x_2; x_{n-r+1}, \ldots, x_n) \\
\quad\vdots \\
y_{n-r} = x_{n-r} + g_{n-r}(x_1, \ldots, x_{n-r-1}; x_{n-r+1}, \ldots, x_n) \\
y_{n-r+1} = g_{n-r+1}(x_1, \ldots, x_n) \\
\quad\vdots \\
y_{n-r+u} = g_{n-r+u}(x_1, \ldots, x_n)
\end{cases}$$

with each $g_i$ ($1 \le i \le n + u - r$) being a randomly chosen quadratic polynomial.

The Public Key

The user selects a random invertible affine transformation $s : K^n \to K^n$, and a
random invertible affine transformation $t : K^{n+u-r} \to K^{n+u-r}$. Let $F = t \circ \Psi \circ s$.

By construction, if we denote $(y'_1, \ldots, y'_{n+u-r}) = F(x'_1, \ldots, x'_n)$, we obtain an
explicit set $\{P_1, \ldots, P_{n+u-r}\}$ of $(n + u - r)$ quadratic polynomials in $n$ variables,
such that:

$$\begin{cases}
y'_1 = P_1(x'_1, \ldots, x'_n) \\
\quad\vdots \\
y'_{n+u-r} = P_{n+u-r}(x'_1, \ldots, x'_n)
\end{cases}$$

This set of $(n + u - r)$ quadratic polynomials constitutes the public key of
this TPM($n, u, r, K$) cryptosystem. Its size is $\frac{1}{8}(n + u - r)(n + 1)(\frac{n}{2} + 1)\log_2(q)$
bytes.
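As a quick arithmetic check of this size formula, the snippet below evaluates it for $n = 64$, $u = 38$, $r = 2$ and $q = 256$, the parameters that Section 2.4 associates with TTM. The function name `public_key_bytes` is hypothetical.

```python
from math import log2


def public_key_bytes(n, u, r, q):
    """Size in bytes of n + u - r quadratic polynomials in n variables over GF(q)."""
    m = n + u - r
    # (n + 1)(n/2 + 1) = (n + 1)(n + 2)/2 coefficients per polynomial:
    # all monomials of degree at most 2 in n variables.
    coefficients = (n + 1) * (n + 2) // 2
    return m * coefficients * log2(q) / 8


print(public_key_bytes(64, 38, 2, 256))   # 214500.0 bytes, i.e. 214.5 Kbytes
```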

[Figure: the $m$ equations of the scheme, split into $r$ “removed” equations, $n - r$ “triangular” equations and $u$ “added” equations.]

Fig. 1. General view of the TPM scheme - The two classes of attacks

2.2 Encryption Protocol (when u > r)

Encrypting a message

Given a plaintext $(x'_1, \ldots, x'_n) \in K^n$, the sender computes $y'_i = P_i(x'_1, \ldots, x'_n)$
for $1 \le i \le n + u - r$ (thanks to the public key) and sends the ciphertext
$(y'_1, \ldots, y'_{n+u-r})$.

Decrypting a message

Given a ciphertext $(y'_1, \ldots, y'_{n+u-r}) \in K^{n+u-r}$, the legitimate receiver recovers
the plaintext by the following method.

- Compute $(y_1, \ldots, y_{n+u-r}) = t^{-1}(y'_1, \ldots, y'_{n+u-r})$;

- Make an exhaustive search on the r-tuple $(x_{n-r+1}, \ldots, x_n) \in K^r$, until the
n-tuple $(x_1, \ldots, x_n)$ obtained by $x_i = y_i - g_i(x_1, \ldots, x_{i-1}; x_{n-r+1}, \ldots, x_n)$
(for $1 \le i \le n - r$) satisfies the $u$ following equations $g_i(x_1, \ldots, x_n) = y_i$ (for
$n - r + 1 \le i \le n - r + u$).

- For the obtained n-tuple $(x_1, \ldots, x_n)$, get $(x'_1, \ldots, x'_n) = s^{-1}(x_1, \ldots, x_n)$.

This decryption algorithm thus has a complexity essentially $O(q^r)$. As a result,
a TPM($n, u, r, K$) cryptosystem can be practically used in encryption mode only
under the assumption that $q^r$ is “small enough”.

The condition u > r ensures that the probability of obtaining a collision
is negligible, and thus that the ciphering function $F$ can be considered as an
injection from $K^n$ into $K^{n+u-r}$.
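The decryption procedure can be illustrated on a toy instance. The sketch below makes several simplifying assumptions that are not part of the scheme as described: the field is a small prime field GF(7) rather than a general GF($q$), the affine maps $s$ and $t$ are taken to be the identity, indices are 0-based, and all function names are hypothetical.

```python
import itertools
import random


def random_quadratic(variables, q, rng):
    """Random quadratic polynomial in the given variable indices."""
    quad = {(i, j): rng.randrange(q) for i in variables for j in variables if i <= j}
    lin = {i: rng.randrange(q) for i in variables}
    return quad, lin, rng.randrange(q)


def evaluate(poly, x, q):
    quad, lin, const = poly
    total = const
    for (i, j), a in quad.items():
        total += a * x[i] * x[j]
    for i, a in lin.items():
        total += a * x[i]
    return total % q


def make_central_map(n, u, r, q, seed=0):
    """g_i for the n - r triangular equations, then for the u added equations."""
    rng = random.Random(seed)
    tail = list(range(n - r, n))                  # the last r variables
    g = [random_quadratic(list(range(i)) + tail, q, rng) for i in range(n - r)]
    g += [random_quadratic(list(range(n)), q, rng) for _ in range(u)]
    return g


def psi(g, x, n, r, q):
    head = [(x[i] + evaluate(g[i], x, q)) % q for i in range(n - r)]
    return head + [evaluate(gi, x, q) for gi in g[n - r:]]


def invert(g, y, n, r, q):
    """Exhaustive search over the last r variables: about q**r trials."""
    solutions = []
    for tail in itertools.product(range(q), repeat=r):
        x = [0] * (n - r) + list(tail)
        for i in range(n - r):                    # solve the triangular part in order
            x[i] = (y[i] - evaluate(g[i], x, q)) % q
        added = range(n - r, len(g))
        if all(evaluate(g[k], x, q) == y[k] for k in added):
            solutions.append(x)
    return solutions


n, u, r, q = 5, 3, 2, 7
g = make_central_map(n, u, r, q)
x = [1, 2, 3, 4, 5]
y = psi(g, x, n, r, q)
print(x in invert(g, y, n, r, q))
```

The outer loop over the $r$-tuple is what produces the $q^r$ factor in the decryption cost; the $u$ added equations serve only to filter out wrong guesses.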

When $r = u = 0$, this kind of scheme has been considered and attacked
by Fell and Diffie in [11] (in an iterated form) and by Patarin and Goubin in
[18]. All these attacks exploit the fact that the inverse function is of low degree
in some variables, whereas the present paper cryptanalyses much more general
cases with $r \ne 0$ and $u \ne 0$.



2.3 Signature Protocol (when u < r)

Signing a message

Given a message $M$, we suppose that $(y'_1, \ldots, y'_{n+u-r}) = h(M) \in K^{n+u-r}$, with
$h$ being a (collision-free) hash function. To sign the message $M$, the legitimate
user:

- computes $(y_1, \ldots, y_{n+u-r}) = t^{-1}(y'_1, \ldots, y'_{n+u-r})$;

- chooses random r-tuples $(x_{n-r+1}, \ldots, x_n)$, until the n-tuple $(x_1, \ldots, x_n)$
obtained by $x_i = y_i - g_i(x_1, \ldots, x_{i-1}; x_{n-r+1}, \ldots, x_n)$ (for $1 \le i \le n - r$)
satisfies the $u$ following equations $g_i(x_1, \ldots, x_n) = y_i$ (for $n - r + 1 \le i \le n - r + u$).

- for the obtained n-tuple $(x_1, \ldots, x_n)$, gets $(x'_1, \ldots, x'_n) = s^{-1}(x_1, \ldots, x_n)$.

This signature algorithm thus has a complexity essentially $O(q^u)$. As a result,
a TPM($n, u, r, K$) cryptosystem can be practically used in signature mode only
under the assumption that $q^u$ is “small enough”.

The condition u < r ensures that the probability of finding no solution
$(x_1, \ldots, x_n)$ for the equation $\Psi(x_1, \ldots, x_n) = (y_1, \ldots, y_{n+u-r})$ is negligible, and
thus that the ciphering function $F$ can be viewed as a surjection from $K^n$ onto
$K^{n+u-r}$.

We will describe in Section 6 a general attack on this signature scheme, that is
also applicable when $u$ is non-zero, with $q^u$ not too large. Therefore the signature
proposed by T.T. Moh in [15,16] is insecure.

2.4 The TTM Encryption System

In the present section, we recall the original description of the TTM
cryptosystem, given by T.T. Moh in [15,16]. This definition of TTM is based on the
concept of tame automorphisms. As we will see, TTM is a particular case of our
general family TPM: it belongs to the family TPM(64, 38, 2, GF(256)).



General Principle

Let $K$ be a finite field (which will be supposed “small” in real applications). We
first consider two bijections $\phi_2$ and $\phi_3$ from $K^{n+v}$ to $K^{n+v}$, with
$(z_1, \ldots, z_{n+v}) = \phi_2(x_1, \ldots, x_{n+v})$ and $(y_1, \ldots, y_{n+v}) = \phi_3(z_1, \ldots, z_{n+v})$
defined by the two following systems of equations:

$$\phi_2 : \begin{cases}
z_1 = x_1 \\
z_2 = x_2 + f_2(x_1) \\
z_3 = x_3 + f_3(x_1, x_2) \\
\quad\vdots \\
z_n = x_n + f_n(x_1, \ldots, x_{n-1}) \\
z_{n+1} = x_{n+1} + f_{n+1}(x_1, \ldots, x_n) \\
\quad\vdots \\
z_{n+v} = x_{n+v} + f_{n+v}(x_1, \ldots, x_{n+v-1})
\end{cases}$$

$$\phi_3 : \begin{cases}
y_1 = z_1 + P(z_{n+1}, \ldots, z_{n+v}) \\
y_2 = z_2 + Q(z_{n+1}, \ldots, z_{n+v}) \\
y_3 = z_3 \\
\quad\vdots \\
y_{n+v} = z_{n+v}
\end{cases}$$

with $f_2, \ldots, f_{n+v}$ quadratic forms over $K$, and $P$, $Q$ two polynomials of degree
eight over $K$.

$\phi_2$ and $\phi_3$ are both “tame automorphisms” (see [15,16] for a definition) and
thus are one-to-one transformations. As a result, $(x_1, \ldots, x_{n+v}) \mapsto (y_1, \ldots, y_{n+v})
= \phi_3 \circ \phi_2(x_1, \ldots, x_{n+v})$ is also one-to-one and can be described by the following
system of equations:

$$\begin{cases}
y_1 = x_1 + P\bigl(x_{n+1} + f_{n+1}(x_1, \ldots, x_n), \ldots, x_{n+v} + f_{n+v}(x_1, \ldots, x_{n+v-1})\bigr) \\
y_2 = x_2 + f_2(x_1) + Q\bigl(x_{n+1} + f_{n+1}(x_1, \ldots, x_n), \ldots, x_{n+v} + f_{n+v}(x_1, \ldots, x_{n+v-1})\bigr) \\
y_3 = x_3 + f_3(x_1, x_2) \\
\quad\vdots \\
y_n = x_n + f_n(x_1, \ldots, x_{n-1}) \\
y_{n+1} = x_{n+1} + f_{n+1}(x_1, \ldots, x_n) \\
\quad\vdots \\
y_{n+v} = x_{n+v} + f_{n+v}(x_1, \ldots, x_{n+v-1})
\end{cases}$$

T.T. Moh found a clever way of choosing $P$, $Q$ and $f_i$ such that $y_1$ and $y_2$ both
become quadratic functions of $x_1, \ldots, x_n$ when we set $x_{n+1} = \ldots = x_{n+v} = 0$.

Actual Parameters

This paragraph is given in the appendix. T.T. Moh chooses $n = 64$, $v = 36$ and
$K = \mathrm{GF}(256)$. As a result, TTM belongs to TPM(64, 38, 2, GF(256)). Applying
the formula of Section 2.1, the size of the public key is 214.5 Kbytes.

3 General Strategy of TPM Attacks

In the present section, we describe a general strategy to attack a cryptosystem
of the TPM family when $r$ is “small”. It will amount to solving the MinRank
problem. As a result TTM, which is a TPM(64, 38, 2, GF(256)), will be broken.



3.1 The MinRank Problem 



Let $r$ be an integer and $K$ a field. We denote by MinRank($r$) the following
problem: given a set $\{M_1, \ldots, M_m\}$ of $n \times n$ matrices whose coefficients lie in $K$,
find at least one m-tuple $(\lambda_1, \ldots, \lambda_m) \in K^m$ such that

$$\mathrm{Rank}\Bigl(\sum_{i=1}^{m} \lambda_i M_i\Bigr) \le r.$$

The (even more) general MinRank problem was first defined and studied
by Shallit, Frandsen and Buss in [19]. It generalizes the “Rank Distance Coding”
problem by Gabidulin [12] (studied also in [3,22]), which itself generalizes the
“Minimal Weight” problem of error correcting codes (see [1,21,2,13]). In the
Shamir-Kipnis attack on Patarin’s HFE cryptosystem [14,17], the authors
used an instance of MinRank($r$) with $r = \lceil \log_q n \rceil + 1$ and therefore their attack
is not polynomial. In the present paper $r$ is a small constant, e.g. 2. We note
that the idea of finding small ranks was first used by Coppersmith, Stern
and Vaudenay in [6,7] for breaking Shamir’s birational scheme [20].

Recently Courtois proposed a new zero-knowledge scheme based on
MinRank [10,9]. Though in the present paper only two algorithms for MinRank are
introduced, another two can be found in [9].
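To make the problem statement of this section concrete, here is a tiny exhaustive-search sketch. It is not one of the algorithms of the paper: it simply tries every coefficient tuple, which costs $p^m$ rank computations. It assumes a prime field GF($p$) so that plain modular arithmetic suffices, it excludes the all-zero tuple (which trivially gives rank 0), and the function names are hypothetical.

```python
import itertools


def rank_mod_p(M, p):
    """Rank of a matrix over GF(p), p prime, by Gauss-Jordan elimination."""
    A = [[v % p for v in row] for row in M]
    rows, cols = len(A), len(A[0])
    rank = 0
    for c in range(cols):
        pivot = next((i for i in range(rank, rows) if A[i][c]), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        inv = pow(A[rank][c], p - 2, p)            # inverse via Fermat's little theorem
        A[rank] = [v * inv % p for v in A[rank]]
        for i in range(rows):
            if i != rank and A[i][c]:
                f = A[i][c]
                A[i] = [(a - f * b) % p for a, b in zip(A[i], A[rank])]
        rank += 1
    return rank


def minrank_bruteforce(mats, r, p):
    """All non-zero coefficient tuples whose combination has rank at most r."""
    n = len(mats[0])
    solutions = []
    for lam in itertools.product(range(p), repeat=len(mats)):
        if not any(lam):
            continue
        combo = [[sum(l * M[i][j] for l, M in zip(lam, mats)) % p
                  for j in range(n)] for i in range(n)]
        if rank_mod_p(combo, p) <= r:
            solutions.append(lam)
    return solutions


M1 = [[1, 0], [0, 1]]
M2 = [[0, 0], [0, 1]]
print(minrank_bruteforce([M1, M2], 1, 3))
```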



3.2 Complexity of MinRank 

The general MinRank problem has been proven to be NP-complete by Shallit,
Frandsen and Buss (see [19]). More precisely, they prove that MinRank($r$) is
NP-complete when $r = n - 1$ (this corresponds to the problem of finding a linear
combination of $M_1, \ldots, M_m$ that is singular). The principle of their proof consists
in writing any set of multivariate equations as an instance of MinRank. It can
be used in the same way to extend their result to the cases $r = n - 2$, $r = n - 3$,
... and even $r = n^{\alpha}$ (when $\alpha > 0$ is fixed). However, MinRank is not hard
when $r$ gets smaller: indeed, in Section 5 we will introduce an expected polynomial time
algorithm to solve MinRank for any fixed $r$.



3.3 Strategy of Attack 

We recall that $m = n + u - r$. We suppose $m \le 2n$, as an encryption function with
expansion rate > 2 is unacceptable. Moreover, if $m > O(n)$, the cryptosystem is
expected to be broken by Gröbner bases [8].

In each equation $y_i = x_i + g_i(x_1, \ldots, x_{i-1}; x_{n-r+1}, \ldots, x_n)$ ($1 \le i \le n - r$),
the homogeneous part is given by ${}^tX A_i X$, with ${}^tX = (x_1, \ldots, x_n)$, $A_i$ being a
(secret) matrix. Similarly, in each public equation $y'_i = P_i(x'_1, \ldots, x'_n)$, the
homogeneous part is given by ${}^tX' M_i X'$, with ${}^tX' = (x'_1, \ldots, x'_n)$, $M_i$ being a (public) matrix.

The fact that $(x_1, \ldots, x_n) = s(x'_1, \ldots, x'_n)$ and $(y'_1, \ldots, y'_m) = t(y_1, \ldots, y_m)$
implies that there exist an invertible $n \times n$ matrix $S$ and an invertible $m \times m$
matrix $T$ such that:

$$T \begin{pmatrix} {}^t(SX') A_1 (SX') \\ \vdots \\ {}^t(SX') A_m (SX') \end{pmatrix}
= \begin{pmatrix} {}^tX' M_1 X' \\ \vdots \\ {}^tX' M_m X' \end{pmatrix}$$

Let $T^{-1} = (t_{ij})_{1 \le i,j \le m}$. We thus have, for any $X'$:

$${}^tX' ({}^tS A_i S) X' = {}^tX' \Bigl(\sum_{j=1}^{m} t_{ij} M_j\Bigr) X'$$

so that:

$$\forall i,\ 1 \le i \le m, \quad \sum_{j=1}^{m} t_{ij} M_j = {}^tS A_i S.$$



From the construction of TPM($n, u, r, K$), we have $\mathrm{Rank}(A_1) \le r$. Since
$S$ is an invertible matrix, we have $\mathrm{Rank}(A_1) = \mathrm{Rank}({}^tS A_1 S)$ and thus

$$\mathrm{Rank}\Bigl(\sum_{j=1}^{m} t_{1j} M_j\Bigr) \le r,$$

that is precisely an instance of MinRank($r$).

Suppose we are able to find (at least) one m-tuple $(\lambda_1, \ldots, \lambda_m)$ such that

$$\mathrm{Rank}\Bigl(\sum_{j=1}^{m} \lambda_j M_j\Bigr) \le r.$$

With a good probability, we can suppose that:

$$\sum_{j=1}^{m} \lambda_j M_j = \mu\, {}^tS A_1 S \quad (\mu \in K^*).$$

Then we deduce the vector spaces Vq = x {0}’’) (corresponding to 

Xn-r+i = ■ ■ ■ = Xn = 0) and Wo = S'“^({0}”“’’ X K^) (corresponding to xi = 
. . . = Xn-r = 0) by simply noticing that Vq = XjMjAi^ and Wq = 

Ker(E7=iA,M^-^i). 

Once we have found $V_0$ and $W_0$, we can easily deduce the vector space
$V_1 = S^{-1}(\{0\} \times K^{n-r-1} \times \{0\}^r)$ of codimension 1 in $V_0$ (corresponding to
$x_1 = x_{n-r+1} = \ldots = x_n = 0$) and $W_1 = S^{-1}(K \times \{0\}^{n-r-1} \times K^r)$ (corresponding to
$x_2 = \ldots = x_{n-r} = 0$): we just look for coefficients $\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_m$ such that
the following equation:

$$\sum_{i=1}^{m} \beta_i y'_i = \sum_{i=1}^{n} \alpha_i x'_i$$

holds for any element of $V_0$. This can be obtained by simple Gaussian reduction.
We also obtain the quadratic function $g_2$ by Gaussian reduction.

By repeating these steps, we obtain two sequences of vector spaces:

$$V_0 \supseteq V_1 \supseteq V_2 \supseteq \ldots \supseteq V_{n-r-1}$$
$$W_0 \subseteq W_1 \subseteq W_2 \subseteq \ldots \subseteq W_{n-r-1}.$$

At the end, we have completely determined the secret transformations $s$ and
$t$, together with the secret functions $g_i$. As a result, this algorithm completely
breaks the TPM family of cryptosystems (we recovered the secret key).

4 Special Case Attacks on TPM 

4.1 The ‘Linearity Attack’ on TTM 

In this paragraph, we study the particular case of TTM, as described by T.T.
Moh in [15,16]. In this case, we show that the MinRank($r$) problem is easily
solved, because of the particular structure of the $Q_8$ function used in $\phi_3$.

Description of the Attack

In Section 3.3, we proved that an attack can be successfully performed on
this cryptosystem, as soon as we can find the vector spaces
$V_0 = S^{-1}(\{0\}^2 \times K^{62})$ (corresponding to $x_1 = x_2 = 0$) and $W_0 = S^{-1}(K^2 \times \{0\}^{62})$
(corresponding to $x_3 = \ldots = x_{64} = 0$). At first sight, the equations giving $y_1$
and $y_2$ seem to be quadratic in $(x_1, \ldots, x_{64})$. This leads a priori to an instance
of MinRank(2).

However, note that the function $x \mapsto x^2$ is linear on $K = \mathrm{GF}(256)$, considered
as a vector space of dimension 8 over $F = \mathrm{GF}(2)$. Therefore, considering
the equations describing the (secret) F function of TTM$^1$, if we choose a basis
$(\omega_1, \ldots, \omega_8)$ of $K$ over $F$ and write $x_i = x_{i,1}\omega_1 + \ldots + x_{i,8}\omega_8$ ($1 \le i \le 64$), $y_1$ and
$y_2$ become linear functions of $x_{1,1}, x_{1,2}, \ldots, x_{1,8}, \ldots, x_{64,1}, \ldots, x_{64,8}$. In terms of
MinRank, this means that TTM leads to an instance of MinRank(0) for $8n \times 8n$
matrices (instead of an instance of MinRank(2) for $n \times n$ matrices). This leads
to the following attack on TTM:

1. Let $x'_i = x'_{i,1}\omega_1 + \ldots + x'_{i,8}\omega_8$ ($1 \le i \le 64$). Rewrite each public equation
$y'_i = P_i(x'_1, \ldots, x'_{64})$ as $y'_i = \tilde{P}_i(x'_{1,1}, \ldots, x'_{64,8})$ (with $\tilde{P}_i$ a quadratic polynomial
in $64 \times 8 = 512$ variables over $F = \mathrm{GF}(2)$).

2. Find the vector space of the 612-tuples $(\beta_1, \ldots, \beta_{100}, \alpha_{1,1}, \ldots, \alpha_{64,8})$
satisfying:

$$\sum_{i=1}^{100} \beta_i y'_i = \sum_{i=1}^{64} \sum_{j=1}^{8} \alpha_{i,j} x'_{i,j}$$

This can be done by Gaussian reduction. We thus obtain the vector spaces
$V_0$ and $W_0$ defined above.

3. The remaining part of the attack is exactly the same as in section 3.3. 
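
The linearity that this attack relies on can be checked directly. The following Python sketch is only an illustration: it assumes that GF(256) is represented modulo the AES polynomial x^8 + x^4 + x^3 + x + 1, a choice the paper does not specify (the property holds for any irreducible polynomial of degree 8), and the helper names are hypothetical.

```python
# Assumption: GF(256) is built modulo x^8 + x^4 + x^3 + x + 1 (0x11B).
MODULUS = 0x11B

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(256), given as integers 0..255."""
    result = 0
    while b:
        if b % 2:
            result ^= a          # addition in characteristic 2 is XOR
        a *= 2                   # multiply a by x
        if a & 0x100:
            a ^= MODULUS         # reduce modulo the field polynomial
        b //= 2
    return result

def gf_square(a: int) -> int:
    return gf_mul(a, a)

# Additivity over GF(2): (a + b)^2 = a^2 + b^2 for all a, b.
is_additive = all(
    gf_square(a ^ b) == gf_square(a) ^ gf_square(b)
    for a in range(256) for b in range(256)
)

# Squaring is therefore an 8x8 matrix over GF(2); column j is the image
# of the basis element x^j (the integer 2**j).
columns = [gf_square(2 ** j) for j in range(8)]

def square_via_matrix(a: int) -> int:
    out = 0
    for j in range(8):
        if (a // 2 ** j) % 2:
            out ^= columns[j]
    return out

matrix_agrees = all(square_via_matrix(a) == gf_square(a) for a in range(256))
print(is_additive, matrix_agrees)
```

Once every $x_i$ is split into its 8 coordinates, any expression of the form $x_i + x_j^2$ is thus a linear form in those coordinates, which is why step 2 only needs Gaussian reduction.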

¹ See (E) in the appendix, in which $t_{19}$ is a linear transformation.







Louis Goubin and Nicolas T. Courtois 



Complexity of the Attack 

The main part of the algorithm consists in solving a system of linear
equations in 612 variables, by Gaussian reduction. We thus obtain a complexity
of approximately $2^{28}$ elementary operations to break TTM.
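
As a rough check of this order of magnitude, the short Python computation below assumes the usual cubic cost of Gaussian reduction in the number of unknowns (an assumption made here for illustration; the variable names are hypothetical).

```python
import math

# 100 coefficients beta_i plus 64 * 8 coefficients alpha_{i,j}
unknowns = 100 + 64 * 8
# Assumed cost model: Gaussian reduction is cubic in the number of unknowns.
log2_cost = math.log2(unknowns ** 3)
print(unknowns, round(log2_cost, 1))
```

The logarithm comes out just under 28, in line with the estimate in the text.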



4.2 Solution to the TTM 2.1 Challenge of US Data Security 

In 1997, US Data Security published on the internet 3 challenges about TTM (see
[23]). On May 2nd, 2000, we managed to break the second challenge, called TTM
2.1. TTM 2.1 is a public key block cipher with plaintext block size 64 and
ciphertext block size 100. It works on the 8-bit finite field GF(256). The
public key has been recovered with approximately 2000 queries to the
“encryption oracle” available on the internet [23]. As mentioned in 2.4, its
size is 214.5 Kbytes. Moreover, it was broken in a simpler way than the one we
described above. By iterative exploration of its linearities, in 3 minutes on
a PC we obtained the following plaintext, which can easily be checked to be the
exact solution to TTM 2.1 (note that the quotation marks are part of this
plaintext):

"Tao TTP way BGKP of living hui mountain wen river moon love pt" 



5 The ‘Kernel Attack’ on MinRank and TPM 

In the present section we need the strategy of attack from 3.3 and use it with a
new attack on MinRank(r), which works when $q^r$ is small enough.

Description of the Attack (notations are as in 3.3) 

1. Choose k random vectors $x^{(1)}, \ldots, x^{(k)}$ (with k an integer
depending on n and m, that we define below). Since
$\dim \mathrm{Ker}(\sum_j \beta_j P_j) = n - \mathrm{Rank}(\sum_j \beta_j P_j) \ge n - r$,
we have the simultaneous conditions $x^{(i)} \in \mathrm{Ker}(\sum_j \beta_j P_j)$
($1 \le i \le k$) with a probability $\ge q^{-kr}$.

2. We suppose we have chosen a “good” set $\{x^{(1)}, \ldots, x^{(k)}\}$ of k
vectors (i.e. such that they all belong to $\mathrm{Ker}(\sum_j \beta_j P_j)$).
Then we can find an m-tuple $(\beta_1, \ldots, \beta_m)$ such that, for all i
with $1 \le i \le k$, $\left(\sum_{j=1}^{m} \beta_j P_j\right)(x^{(i)}) = 0$. They
are solutions of a system of kn linear equations in m indeterminates. As
a result, if we let $k = \lceil m/n \rceil$, the solution is essentially unique
and can be easily found by Gaussian reduction. We thus obtain the two vector
spaces $V_0 = s^{-1}(K^{n-r} \times \{0\}^r)$ (corresponding to
$x_{n-r+1} = \ldots = x_n = 0$) and $W_0 = s^{-1}(\{0\}^{n-r} \times K^r)$
(corresponding to $x_1 = \ldots = x_{n-r} = 0$).

3. The remaining part of the attack is exactly the same as in section 3.3. 

Complexity of the Attack 

The complexity of the attack is easily computed: $O(q^{\lceil m/n \rceil r} \cdot m^3)$.

Application to TTM

In the particular case of TTM, we have q = 256, n = 64, m = 100 and r = 2.
We thus obtain an attack on TTM with complexity $O(2^{52})$.

Note: Compared to the $2^{28}$ of section 4.1, this attack is slower, but it
does not make use of any linearity of $y_1$ and $y_2$, so that it can also be
used to break possible generalizations of TTM, with more general “$Q_8$
components” (see [4] for examples of $Q_8$ which provide non linear expressions
for $y_1$ and $y_2$ over GF(2)).
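
The figure for TTM can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch under two stated assumptions: a good set of $k = \lceil m/n \rceil$ vectors is hit after about $q^{kr}$ trials, and each trial costs about $m^3$ operations for the Gaussian reduction. The variable names are hypothetical.

```python
import math

q, n, m, r = 256, 64, 100, 2          # TTM parameters quoted in the text

k = math.ceil(m / n)                  # number of random vectors per trial
log2_trials = k * r * math.log2(q)    # assumed: about q^(k*r) trials needed
log2_per_trial = 3 * math.log2(m)     # assumed: cubic Gaussian reduction
total_log2 = log2_trials + log2_per_trial
print(k, log2_trials, round(total_log2, 1))
```

With these values the exponent is dominated by the $2^{32}$ expected trials, the linear algebra contributing roughly another $2^{20}$.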

6 The ‘Degeneracy Attack’ on TPM Signature Schemes 

We describe here a general attack on TPM signature schemes (recall that such
schemes are possible only for $u \le r$), when $q^u$ is not too large. From the
description of the attack, its complexity is easily seen to be $O(q^u \cdot n^6)$.
We use the same notations as in section 3.3. In particular, m = n + u − r.

1. We choose a random m-tuple $(\beta_1, \ldots, \beta_m) \in K^m$. With a
probability $q^{-u}$, we can suppose that $\sum_{i=1}^{m} \beta_i P_i$ is a
degenerate quadratic polynomial (i.e. a quadratic polynomial which can be
rewritten with fewer variables after a linear change of variables). The fact
that a quadratic polynomial is degenerate can easily be detected: for instance
by using its canonical form (see [18] for some other methods).

2. Suppose we have found a “good” m-tuple $(\beta_1, \ldots, \beta_m)$.
Considering the new set of (fewer than n) variables for the quadratic form
$\sum_{i=1}^{m} \beta_i P_i$, we deduce easily the vector space
$W_{n-r} = s^{-1}(K^{n-r-1} \times \{0\} \times K^r)$.

3. Then we look for an n-tuple $(\alpha_1, \ldots, \alpha_n) \in K^n$ and a
quadratic function $g_{n-r}$ such that:

$$\sum_{i=1}^{m} \beta_i y'_i = \sum_{i=1}^{n} \alpha_i x'_i + g_{n-r}(x'_1, \ldots, x'_n)$$

is true for any $(x'_1, \ldots, x'_n) \in W_{n-r}$. This can be done by Gaussian
reduction. We thus obtain the vector space
$V_{n-r} = s^{-1}(\{0\}^{n-r-1} \times K \times \{0\}^r)$ and the quadratic
polynomial $g_{n-r}$.

4. The same principle can be repeated n − r times, so as to obtain two sequences
of vector spaces:

$$V_{n-r} \subseteq V_{n-r-1} \subseteq \ldots \subseteq V_0$$
$$W_{n-r} \supseteq W_{n-r-1} \supseteq \ldots \supseteq W_0.$$

At the end, as in the attack described in section 3.3, we have completely
determined the secret transformations s and t, together with the secret
functions $g_i$. As a result, this algorithm completely breaks the TPM family in
signature mode (we recovered the secret key).







7 Conclusion 

We cryptanalysed a large class of cryptosystems TPM, that includes TTM as
described by T.T. Moh [16]. They can be broken in polynomial time, as long as
r is fixed. The proposed TTM cryptosystem [16] can be broken in $2^{28}$ due to
linearities. Thus we could easily break the “TTM 2.1” challenge proposed by US
Data Security in October 1997. Even if $Q_8$ was nonlinear, and since r = 2, it
is still broken in $2^{52}$ elementary operations for a 512-bit cryptosystem.

We also showed that signature schemes using TPM are insecure. There is
very little hope that a secure triangular system will ever be proposed.

Appendix: Actual Parameters for the TTM Cryptosystem

Let $Q_8$ be the function defined by

Qaiqi, ■■■, qso) = qf + qw + qlo + [qi + qhl + qlql + qlqi2 + qhlsl 

^ [<79 + (9i0 + 914915 + 918'Z19 + 'Z20 'Z21 + <Z22<Z24)(<Zii + 916917 + 923928 + 925 'Z26 + 'Z13'727)]- 
However, we obtain $Q_8(q_1, \ldots, q_{30}) = t_{19}^2$ as soon as we substitute the $q_1, \ldots, q_{30}$ with:



9l — ti + tgte 


q2 = i^+ tstr 


93 = tg + t4tio 


94 = 


95 = tstii 


qe = titj 


qr = 




99 = tg + ^8^9 


9l0 = tg + tl2tl3 


9ll = tg -b ti4ti5 


9i2 = trtio 


9l3 = tiotll 


914 = + ^7^8 


9i5 = ti3 + tiitie 


9l6 = ti 4 + tlotl 2 


917 = ti5 + tlltl7 


9l8 = tl2tl6 


9l9 = tiiti2 


920 = tgti3 


921 = ^7^13 


922 = tstl6 


923 = ti4ti7 


924 = trtii 


925 = tl2tl5 


926 = tlotl5 


927 = ti2tl7 


928 = tiiti4 


$q_{29} = t_{18} + t_1^2$


$q_{30} = t_{19} + t_{18}^2$

We put n = 64, v = 36, and we consider the $t_i = t_i(u_1, \ldots, u_{19})$ ($1 \le i \le 19$)
as randomly chosen linear forms (i.e. homogeneous polynomials of degree one in
$u_1, \ldots, u_{19}$), satisfying the following conditions:



- ti(Mi, . . . ,Mig) = Ml ; 

— , Mig) = Mi8 ; 



— tig(Mi, . . . , Mig) — Mig ; 



- te{ui, Mig), triui, Mig), tis{ui, Mig) and tig(Mi, . . . , Mig) depend 
only on the variables Mg, My, . . . , Miy, 



We thus obtain polynomials $q_i = q_i(u_1, \ldots, u_{19})$ ($1 \le i \le 30$) of degree two in
$u_1, \ldots, u_{19}$. Finally, we choose:



' P{zq5, 

Q(z65, 
feiixi, 
f62(xi, 
fesixi, 
fe4(xi, 

< fesixi, 

fg2(xi , . . . , Xgi) = qgsixg, a:ii, . . . , a;i 6 , x^i , . . . , Xgg) 

/*93 (^1 ? • ■ • j Xq2 ) q\ix\g^ X\^j , . . . , X 2 O: Xi^, XiQ, ■ ■ ■ 5 ^60 : ^63 j Xq4.^ 

. flOo(,Xi ,!..., Xgg) ^7s(^10j Xi^ , . . . ,3:205^15j^l6?^5l5 ■ ■ ■ 5 ^60 5 ^63? Xf^ 4 ^ 



■ ■ ) 2100) — <38(293) ■ • ■ ) 2l00) 273) • • ■ ) 2g2) 263) 254) 

■ • ) 2100) = <3 s( 265) • ■ • ) 292) 261, 262) 

■ • ) ^60 ) ^29 (^9 )^11) ■ ■ ■ )^16) X^± , , X62) 3:-61 

• ■ ) a^6i) = <73o(a^9) a^ii) • • • ) a^i6) 2:51, . . . , 2:62) — 2:62 

• ■ ) 2:62) = <729(2:10) 2 <i 7) . . . , 2<20) a;i5, X16, 2:51, . . . , a;60) 2:63, 2:64) — 2:63 

■ ■ ) 2:63) <?30 (2:10 ) 2 :i 7) ■ ■ ■ ) 2<20 ) 2:15 ) 2:16 ) 2:51 , . . . , 2:60) 2:63 , 2:54) 2:64 

. . , 2:64) <7l (2:9 ,2:11, . . . ,Xi 6)2:51, . . . , 2:62 ) 



and randomly chosen quadratic forms for $f_i$ ($2 \le i \le 60$).
Let us denote by $\theta : K^{64} \to K^{100}$ the function defined by

$$\theta(x_1, \ldots, x_{64}) = (x_1, \ldots, x_{64}, 0, \ldots, 0).$$

Hence $(x_1, \ldots, x_{64}) \mapsto (y_1, \ldots, y_{100}) = \varphi_3 \circ \varphi_2 \circ \theta(x_1, \ldots, x_{64})$
is given by the following system:

$$
(E)\ \left\{
\begin{array}{ll}
y_1 = x_1 + [t_{19}(x_9, x_{11}, \ldots, x_{16}, x_{51}, \ldots, x_{62})]^2 & (= x_1 + x_{62}^2) \\
y_2 = x_2 + f_2(x_1) + [t_{19}(x_{10}, x_{17}, \ldots, x_{20}, x_{15}, x_{16}, x_{51}, \ldots, x_{60}, x_{63}, x_{64})]^2 & (= x_2 + f_2(x_1) + x_{64}^2) \\
y_3 = x_3 + f_3(x_1, x_2) & \\
\quad\vdots & \\
y_{60} = x_{60} + f_{60}(x_1, \ldots, x_{59}) & \\
y_{61} = q_{29}(x_9, x_{11}, \ldots, x_{16}, x_{51}, \ldots, x_{62}) & (= x_{61} + x_9^2) \\
y_{62} = q_{30}(x_9, x_{11}, \ldots, x_{16}, x_{51}, \ldots, x_{62}) & (= x_{62} + x_{61}^2) \\
y_{63} = q_{29}(x_{10}, x_{17}, \ldots, x_{20}, x_{15}, x_{16}, x_{51}, \ldots, x_{60}, x_{63}, x_{64}) & (= x_{63} + x_{10}^2) \\
y_{64} = q_{30}(x_{10}, x_{17}, \ldots, x_{20}, x_{15}, x_{16}, x_{51}, \ldots, x_{60}, x_{63}, x_{64}) & (= x_{64} + x_{63}^2) \\
y_{65} = q_1(x_9, x_{11}, \ldots, x_{16}, x_{51}, \ldots, x_{62}) & \\
\quad\vdots & \\
y_{92} = q_{28}(x_9, x_{11}, \ldots, x_{16}, x_{51}, \ldots, x_{62}) & \\
y_{93} = q_1(x_{10}, x_{17}, \ldots, x_{20}, x_{15}, x_{16}, x_{51}, \ldots, x_{60}, x_{63}, x_{64}) & \\
\quad\vdots & \\
y_{100} = q_8(x_{10}, x_{17}, \ldots, x_{20}, x_{15}, x_{16}, x_{51}, \ldots, x_{60}, x_{63}, x_{64}) &
\end{array}
\right.
$$



The Public Key 

The user selects a random invertible affine transformation
$\varphi_1 : K^{64} \to K^{64}$ and a random invertible affine transformation
$\varphi_4 : K^{100} \to K^{100}$, such that the function
$F = \varphi_4 \circ \varphi_3 \circ \varphi_2 \circ \theta \circ \varphi_1$ satisfies

$$F(0, \ldots, 0) = (0, \ldots, 0).$$

By construction of F, if we denote $(y'_1, \ldots, y'_{100}) = F(x'_1, \ldots, x'_{64})$,
then we have an explicit set $\{P_1, \ldots, P_{100}\}$ of 100 quadratic
polynomials in 64 variables, such that:

$$
\left\{
\begin{array}{l}
y'_1 = P_1(x'_1, \ldots, x'_{64}) \\
\quad\vdots \\
y'_{100} = P_{100}(x'_1, \ldots, x'_{64})
\end{array}
\right.
$$

This set of 100 polynomials constitutes the public key of the TTM cryptosystem.

Encrypting a Message 

Given a plaintext $(x'_1, \ldots, x'_{64}) \in K^{64}$, the sender computes
$y'_i = P_i(x'_1, \ldots, x'_{64})$ for $1 \le i \le 100$ (thanks to the public
key) and sends the ciphertext $(y'_1, \ldots, y'_{100})$.

Decrypting a Message 

Given a ciphertext $(y'_1, \ldots, y'_{100}) \in K^{100}$, the legitimate
receiver recovers the plaintext by:

$$(x'_1, \ldots, x'_{64}) = \varphi_1^{-1} \circ \pi \circ \varphi_2^{-1} \circ \varphi_3^{-1} \circ \varphi_4^{-1}(y'_1, \ldots, y'_{100})$$

with $\pi : K^{100} \to K^{64}$ defined by $\pi(x_1, \ldots, x_{100}) = (x_1, \ldots, x_{64})$,
which thus satisfies $\pi \circ \theta = \mathrm{Id}$.

Abstract. Batch verification can provide large computational savings
when several signatures, or other constructs, are verified together. Sev- 
eral batch verification algorithms have been published in recent years, 
in particular for both DSA-type and RSA signatures. We describe new 
attacks on several of these published schemes. A general weakness is 
explained which applies to almost all known batch verifiers for discrete 
logarithm based signature schemes. It is shown how this weakness can be 
eliminated given extra properties about the underlying group structure. 
A new general batch verifier for exponentiation in any cyclic group is 
also described as well as a batch verifier for modified RSA signatures. 



1 Introduction 

Modular exponentiation is a fundamental operation for most practical digital 
signature schemes. The computational expense of both signing and verifying 
signatures is mainly due to the modular exponentiation required. Several tech- 
niques have been proposed in the literature to reduce this expense, including use 
of small exponents, and multi-exponentiation techniques [21]. An alternative way 
to realize a computational reduction is through use of batch cryptography. 

Batch cryptography is relevant in settings where many signatures (or other 
primitives) need to be generated and/or verified together. Electronic commerce 
applications are prime examples, as typically many customers interact with the 
same merchant or banking server. Although techniques have been developed to 
improve signature generation [6,16], the majority of the recent work in the area 
has focused on the batch verification of signatures. These techniques all exploit 
the homomorphic properties of exponentiation in various groups to combine a 
set of exponentiations into one equation whose computational effort is effectively 
divided amongst all the individual exponentiations required. 

The purpose of this paper is to illustrate flaws in a number of published batch 
verifiers; in some cases they are broken whilst in others we show that they do 
not provide the strength of verification claimed. We show that an observation 
of Bellare et al. [1] , regarding the restrictions on use of certain batch verifiers, 
has much more serious consequences than they imply; in most applications this 

T. Okamoto (Ed.): ASIACRYPT 2000, LNCS 1976, pp. 58-71, 2000. 

© Springer-Verlag Berlin Heidelberg 2000




Attacking and Repairing Batch Verification Schemes 






makes the tests ineffective. Through stronger assumptions on the group structure 
we show how these tests may be repaired. 

1.1 Background 

The idea of batch cryptography was introduced by Fiat [6,7]; his scheme amor- 
tized the private key operations for RSA and so was designed to assist in the 
signing and decryption operations. His idea was to batch a number of messages 
together, perform one full-scale modular exponentiation to sign the messages si- 
multaneously, and then split apart the batch into individually signed messages. 
This is achievable due to the homomorphic property of RSA and the use of 
multiple, relatively prime, public exponents, an idea introduced by Chaum [4]. 

Batch verification for DSA signatures was introduced by Naccache, M’Raihi, 
Raphaeli and Vaudenay [15]. Their scheme is designed to verify several DSA 
signatures at once by checking that a batch criterion holds and is much more 
efficient than sequential verification of individual DSA signatures¹. Harn
subsequently proposed a new method for DSA signatures requiring interaction
between signer and verifier [10] and later devised a non-interactive version [11].

Early work concerning (non-interactive) batch verification was also published 
by Yen and Laih [22] . Their verification techniques are proposed for batch veri- 
fication of a modification of the Schnorr or Brickell-McCurley signature schemes 
as well as for RSA. The principle, once again, is based upon the homomorphic 
properties of the respective scheme. Yen and Laih also note that to remain se- 
cure from attack, the verifier must choose random exponent values and apply 
these during batch verification. These values prevent the signer from attempting 
to introduce false signatures that would otherwise satisfy the batch verification 
criterion (the properties of this test are discussed in more detail in section 1.2). 

Recently, Bellare, Garay, and Rabin [1,2] described several techniques for 
conducting batch verification of exponentiation with high confidence that false 
values have not been mixed into the batch. The technique which they refer to 
as the small exponents test, is very similar to the algorithms of Naccache et al. 
[15] and Yen and Laih [22], while their more sophisticated bucket test turns out 
to be more efficient for larger batch instances. 



1.2 Batch Verification of Exponentiation 

First we give a general idea of how batch verification of exponentiation works
in a group. Consider the situation where we are given n elements
$y_1, y_2, \ldots, y_n$, all in a multiplicative group G, and n exponents
$x_1, x_2, \ldots, x_n$, all integers up to some size (we will become more
specific shortly). A fixed element $g \in G$ is known. The idea of batch
verification is to check that $y_i = g^{x_i}$ for each i without having to make
this explicit calculation n times. In the case that the $x_i$ values are indeed
the discrete logarithms of the respective $y_i$ values we will say that
¹ An earlier version of the paper of Naccache et al. included an additional interactive
batch verifier. Lim and Lee [12] showed that this version is not secure.







Colin Boyd and Chris Pavlovski 



the batch is correct. A good batch verification algorithm should identify, at least 
with high probability, whenever one or more of the Xi values is not the discrete 
logarithm of the respective yi. 

All the known batch verification techniques are based on the multiplicative 
property of the group. Specifically, if the batch is correct then the following 
equation holds. 



$$g^{\sum_{i=1}^{n} x_i} = \prod_{i=1}^{n} y_i \qquad (1)$$

It is easily checked that the converse is false: if equation 1 holds then it need 
not be the case that the batch is correct. For example, adding a constant to 
one Xi value and subtracting the same constant from a different Xi value does 
not change equation 1 but invalidates the batch. Another example is where the 
correct Xi values are randomly permuted. 
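
Both counterexamples are easy to reproduce. The Python sketch below is illustrative only: the toy parameters (p = 23, q = 11, g = 2, where 2 has order 11 modulo 23) and the function names are assumptions chosen for readability, not values taken from any of the cited schemes.

```python
p, q, g = 23, 11, 2   # toy parameters (assumed): g has order q modulo p

def naive_batch_ok(pairs):
    """Equation 1: g^(sum of x_i) equals the product of the y_i."""
    exponent = sum(x for x, _ in pairs) % q
    product = 1
    for _, y in pairs:
        product = product * y % p
    return pow(g, exponent, p) == product

def batch_is_correct(pairs):
    """True only if every x_i really is the discrete logarithm of y_i."""
    return all(pow(g, x, p) == y for x, y in pairs)

xs = [3, 7, 9]
good = [(x, pow(g, x, p)) for x in xs]

# Add a constant to one exponent and subtract it from another.
shifted = [((xs[0] + 4) % q, good[0][1]),
           ((xs[1] - 4) % q, good[1][1]),
           good[2]]

# Randomly permuting the correct exponents has the same effect;
# here a fixed cyclic shift is used.
permuted = [(xs[1], good[0][1]), (xs[2], good[1][1]), (xs[0], good[2][1])]

for name, batch in [("good", good), ("shifted", shifted), ("permuted", permuted)]:
    print(name, naive_batch_ok(batch), batch_is_correct(batch))
```

In both altered batches the sum of the exponents is unchanged, so equation 1 still holds although no altered pair is correct.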

Various authors [15,22] have noticed this and suggested that, to turn equation
1 into a useful batch verifier, randomisation should be introduced. This is done by
multiplying the $x_i$ values by small random values which must also be introduced
as small exponents for the $y_i$ values. An attacker who wishes to have an incorrect
batch accepted has to anticipate which random values will be used. We follow
Bellare et al. [1] and call this idea the small exponents test. The algorithm is
shown in table 1. Bellare et al. prove that the small exponents test is a good
batch verifier with error bounded by $2^{-l}$ as long as q, the order of the
group G, is prime. It can be seen that the algorithm uses one full
exponentiation in G plus n multiplications to obtain x and finally the cost of
the n small exponentiations to find y. Bellare et al. use a
multi-exponentiation algorithm to show that the total average cost is
$l + n(1 + l/2)$ multiplications in addition to the full exponentiation.



Given: g a generator of the group G of prime order q, and
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $x_i \in Z_q$ and $y_i \in G$. Also a security parameter l.
Check: That $\forall i \in \{1, \ldots, n\} : y_i = g^{x_i}$.

1. Pick $s_1, \ldots, s_n \in \{0, 1\}^l$ at random.

2. Compute $x = \sum_{i=1}^{n} x_i s_i \bmod q$ and $y = \prod_{i=1}^{n} y_i^{s_i}$.

3. If $g^x = y$ then accept, else reject.



Table 1. Small exponents test for batch verification of exponentiation [1]
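
The three steps of table 1 translate almost line by line into code. The following Python sketch is a minimal illustration, not an optimised verifier: the toy group (p = 2039, q = 1019, g = 4, where 4 has prime order 1019 modulo the safe prime 2039), the choice l = 8 and the function name are all assumptions made here.

```python
import secrets

P, Q, G = 2039, 1019, 4   # toy group (assumed): 4 has prime order 1019 mod 2039

def small_exponents_test(pairs, l=8):
    """Return True if the batch of (x_i, y_i) pairs passes the test of table 1."""
    exps = [secrets.randbits(l) for _ in pairs]                     # step 1
    x = sum(x_i * s_i for (x_i, _), s_i in zip(pairs, exps)) % Q    # step 2
    y = 1
    for (_, y_i), s_i in zip(pairs, exps):
        y = y * pow(y_i, s_i, P) % P
    return pow(G, x, P) == y                                        # step 3

good = [(x_i, pow(G, x_i, P)) for x_i in (5, 77, 900)]

# A batch that satisfies equation 1 but is not correct.
bad = [((good[0][0] + 1) % Q, good[0][1]),
       ((good[1][0] - 1) % Q, good[1][1]),
       good[2]]

print(small_exponents_test(good), small_exponents_test(bad))
```

A correct batch is always accepted. The altered batch is accepted only when the two random exponents involved happen to coincide, which for l = 8 occurs about once in 256 runs.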



We will concentrate on the small exponents test in this paper. Bellare et 
al. also propose a variation which they call the bucket test which can be more 
efficient for large batches. Our general results apply also to the bucket test and 
we discuss the difference further in section 3.2. 

A critical assumption in the small exponents test is that the $y_i$ values lie in
the group of prime order, G. This rules out the case where G is the multiplicative
group $Z_n^*$ for n composite as used in RSA and related algorithms. Nevertheless,
Bellare et al. have shown that there is a simpler form of verification, which
they called screening, that applies to RSA signatures². Screening shows that
the signatures must have, at some time, been formed by the true owner of the 
private key even though none of the individual claimed signatures might actually 
be correct. Screening is sufficient in applications where it is not necessary to 
possess the signatures, but only to know that the messages were signed; an 
example might be bulk verification of certificates. 

1.3 Central Observation and Contribution 

As mentioned above, it is a requirement in the proof of correctness of the small 
exponents test that all operations are performed within a group G of prime 
order. Bellare et al. suggest that in practice this is not really a restriction as this 
setting is commonplace in many modern cryptographic schemes. 

They observe that when the order of G is not prime the small exponents
test will not work. For an example they use $G = Z_p^*$, which has non-prime
order p − 1. Let g be a generator of $Z_p^*$, and suppose $y = g^x \bmod p$. Under
these assumptions the small exponents test will not detect the invalid batch
with two pairs $(x, -y \bmod p)$, $(x, y)$ when the small exponent for the first
pair is even, which occurs with probability 1/2. Notice that if y lies in some
prime order subgroup G then $-y$ cannot lie in G.
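
This example can be replayed with small numbers. In the Python sketch below, the values p = 23 and g = 5 (a generator of the full group of order 22) are toy assumptions, and the small exponents are passed in explicitly so that both outcomes are visible; the function name is hypothetical.

```python
p, g = 23, 5   # toy values (assumed): 5 generates the full group Z_23^*

def small_exponents_check(pairs, exps):
    """Small exponents test carried out in the group of non-prime order p - 1."""
    order = p - 1
    x = sum(x_i * s_i for (x_i, _), s_i in zip(pairs, exps)) % order
    y = 1
    for (_, y_i), s_i in zip(pairs, exps):
        y = y * pow(y_i, s_i, p) % p
    return pow(g, x, p) == y

x = 7
y = pow(g, x, p)
invalid_batch = [(x, (-y) % p), (x, y)]   # the first pair is wrong

even_first = small_exponents_check(invalid_batch, [4, 3])
odd_first = small_exponents_check(invalid_batch, [5, 3])
print(even_first, odd_first)
```

The factor $(-1)^{s_1}$ introduced by the first pair disappears whenever $s_1$ is even, so the invalid batch passes in that case and fails otherwise.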

The theme of this paper revolves around the requirement of working in a 
prime order group, and can be summarised in two significant observations. 

1. Several authors have ignored this requirement. We give explicit attacks to 
show that their proposed batch verifiers do not work as advertised. 

2. Even when this requirement is stated, it is not usually possible to check effi- 
ciently that it actually holds in a batch presented for verification. This makes 
most applications, including batch verification of DSA signatures [2,15], in- 
appropriate unless additional properties hold. 

The remainder of this paper is structured as follows. In the next section we 
show that the claimed strong RSA batch verifiers proposed by Yen and Laih [22] 
actually provide only the weaker screening property. We also present an explicit 
attack on the batch DSA verifiers of Harn [11], showing that an outsider can forge 
a batch signature for messages of his choosing. In the following section we outline 
a general attack that is applicable to verifiers of signatures in batches, illustrating 
how this may be applied to the small exponents test for batch verification of 
DSA signatures [2,15]. The attack allows the true signer to have false signatures 
accepted by the verifier. We then demonstrate how this general attack may be 
avoided by careful choice of the prime modulus used and give a generalised 
small exponents test for any cyclic group. We finally present a batch verifier for 
modified RSA signatures. 

² Coron and Naccache [5] pointed out that screening can fail if duplicate messages are
present. A modified version of screening was later proven correct by Bellare et al. [2].

2 Specific Attacks on Batch Verification Schemes

In this section we look at two schemes for batch verification which do not operate 
in prime order groups. The first works with a composite modulus, while the 
second performs a modular reduction before verification which destroys the group 
structure. We show that in both cases the verification does not provide the 
assurances claimed. 

2.1 Yen and Laih’s RSA Batch Verification 

Yen and Laih [22] proposed a variation of ElGamal signatures suitable for batch 
signature verification. Here we consider the RSA batch verification technique 
that they devised as a performance comparison with their proposed scheme. 
They have essentially proposed to use the small exponents test in the RSA
multiplicative group. Specifically, suppose that $S_1, \ldots, S_n$ are claimed
RSA signatures [18] on messages $m_1, \ldots, m_n$ (where these messages have
been pre-processed by any chosen hashing and redundancy functions). If the
signatures are correct then $S_i = m_i^d \bmod N$ where d is the RSA private
exponent and N the modulus. Small exponents $s_1, \ldots, s_n$ are chosen
randomly of length l. The batch verification is then to test if the following
equation holds, where e is the RSA public exponent.

$$\left( \prod_{i=1}^{n} S_i^{s_i} \right)^e = \prod_{i=1}^{n} m_i^{s_i} \bmod N \qquad (2)$$

Notice that this test is not as efficient as the small exponents test described 
in table 1 because it is not possible for the verifier to add the exponents on 
the left hand side modulo the group order. Furthermore, in practice a small 
value of e is often used which severely limits the benefit of batch verification. 
For example, if e = 3 then the batch verification can never be as efficient as 
individual verification of the signatures with any reasonable failure probability. 
But regardless of the test's efficiency it is wrong to assume that it provides
more than screening; this means that use of the small exponents is redundant
since Bellare et al. showed that equation 2 provides screening with all
$s_i = 1$, at least in the case of full domain hashing.

The simplest attack is to replace some $S_i$ values by $-S_i$ and some $m_i$
values by $-m_i$ (all modulo N). Then the test will still succeed with
probability 1/2 depending on the parity of the $s_i$ values chosen. This attack
can be launched by any party. It can be compounded by the signer who can choose
an element a of small order t in the multiplicative group (t should be smaller
than $2^l$). Any $S_i$ value can then be replaced by $aS_i \bmod N$ and the test
will succeed with probability 1/t. Note that it is easy to find such an a if the
factorisation of N is known.
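
A small experiment illustrates the sign-change attack on equation 2. The sketch below is an assumption-laden toy: it uses the textbook RSA key N = 61 · 53 = 3233, e = 17, d = 2753, negates a single signature, and takes the small exponents as explicit arguments instead of drawing them at random. The function and variable names are hypothetical.

```python
N, e, d = 3233, 17, 2753   # toy RSA key (assumed): N = 61 * 53

def rsa_batch_ok(sigs, msgs, exps):
    """Equation 2: (prod S_i^{s_i})^e == prod m_i^{s_i} (mod N)."""
    lhs = 1
    rhs = 1
    for S, m, s in zip(sigs, msgs, exps):
        lhs = lhs * pow(S, s, N) % N
        rhs = rhs * pow(m, s, N) % N
    return pow(lhs, e, N) == rhs

msgs = [65, 1234, 2021]                    # already hashed messages (assumed)
sigs = [pow(m, d, N) for m in msgs]        # correct signatures

# Replace the first signature by its negative modulo N.
forged = [(-sigs[0]) % N] + sigs[1:]

print(rsa_batch_ok(sigs, msgs, [3, 5, 6]),     # honest batch
      rsa_batch_ok(forged, msgs, [4, 5, 6]),   # even exponent on the forgery
      rsa_batch_ok(forged, msgs, [3, 5, 6]))   # odd exponent on the forgery
```

Since e is odd, the forged value contributes a factor $(-1)^{s_1}$ to the left-hand side, which vanishes exactly when $s_1$ is even.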

2.2 Harn's DSA Batch Verification

Harn [11] proposed an algorithm which is essentially a direct application of
equation 1 to variants of DSA signatures. Specifically he considers the following
signature algorithm. Primes p and q are chosen with $q \mid p - 1$ and a
generator g of the group G of order q is published. A user's private key is a
number x in $Z_q$ and the corresponding public key is $y = g^x \bmod p$. A
signature of a message m (again pre-processed by hashing) is a pair (r, s)
where both r and s lie in $Z_q$. A claimed signature pair is correct if the
following verification equation holds.

$$r = (g^{s r^{-1}} y^{m r^{-1}} \bmod p) \bmod q$$

Now suppose that $m_1, \ldots, m_n$ is a batch of messages with corresponding set
of claimed signatures $(r_1, s_1), \ldots, (r_n, s_n)$. Applying the
multiplicative property, the following equation holds, which is also the
proposed batch verification test.

$$\prod_{i=1}^{n} r_i \bmod q = \left( g^{\sum_{i=1}^{n} s_i r_i^{-1}} \, y^{\sum_{i=1}^{n} m_i r_i^{-1}} \bmod p \right) \bmod q \qquad (3)$$

Our first observation is that this test can provide no more than screening.
For suppose that a batch of correct signatures is known. Keep the $r_i$ values
the same and then choose the n − 1 values $s'_1, \ldots, s'_{n-1}$ randomly and
finally solve the equation

$$\sum_{i=1}^{n} s'_i r_i^{-1} \bmod q = \sum_{i=1}^{n} s_i r_i^{-1} \bmod q$$

to obtain the value $s'_n$. Then the batch $(r_1, s'_1), \ldots, (r_n, s'_n)$
satisfies the test but almost certainly none of the signatures is correct.

Now we show that the situation is compromised even further by an explicit 
attack. With high probability it is possible for an attacker who is not the signer 
to find signatures for any chosen message set. We only need to assume that the 
attacker has any known signature for this scheme: this gives values A, B and C
with $A = (g^B y^C \bmod p) \bmod q$. We suppose that the attacker has chosen two
messages for signing, say $m_1$ and $m_2$ (the attack is easily generalised to
any number of messages). The attack works by making verification equation 3 the
same as for the known signature. This is done in two steps.

1. Solve for $r_1$ and $r_2$ to ensure that

$$r_1 r_2 = A \bmod q$$
$$m_1 r_1^{-1} + m_2 r_2^{-1} = C \bmod q.$$

2. Solve for $s_1$ and $s_2$ to ensure that

$$s_1 r_1^{-1} + s_2 r_2^{-1} = B \bmod q.$$

The simultaneous equations in step 1 can be reduced to the quadratic equation
$(m_2/A) r_1^2 - C r_1 + m_1 = 0 \bmod q$ which can be solved by completion of the
square as long as the discriminant $C^2 - 4 m_1 m_2 / A$ is a quadratic residue
modulo q. On the assumption that $m_1$ and $m_2$ are random (they are the result
of hashing) this will be the case with probability 1/2. Step 2 can then be
completed by choosing $s_1$ randomly and solving for $s_2$.

The attack can be generalised for any number of messages to be forged. In
step 1 all but two of the values can be chosen randomly and then the remaining 
two found by solving a quadratic equation as described above. Step 2 proceeds 
as above with all but one of the Si values chosen at random. It is interesting 
to note that this attack will not work if random small exponents are added to 
the verification equation. However, since there is no security proof it would be 
dangerous to rely on such a test. 
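
The two-step forgery can be sketched in Python as follows. Several things here are assumptions made for illustration: the toy parameters (p = 2039, q = 1019, g = 4), the signing equation s = rk − mx mod q used to produce the one known signature, the assignment of the s-sum to the exponent of g and the m-sum to the exponent of y in the batch test, and all function names. For brevity, step 1 is solved by exhaustive search over $r_1$ instead of completing the square, which is only reasonable for such a small q.

```python
import secrets

P, Q, G = 2039, 1019, 4        # toy parameters (assumed): 4 has order Q mod P
X = 123                        # signer's key, used only to build the known signature
Y = pow(G, X, P)

def inv(a):
    return pow(a, -1, Q)       # modular inverse (Python 3.8+)

def batch_ok(batch):
    """Batch test shaped like equation 3; batch holds (m_i, r_i, s_i) triples."""
    prod_r, exp_g, exp_y = 1, 0, 0
    for m, r, s in batch:
        prod_r = prod_r * r % Q
        exp_g = (exp_g + s * inv(r)) % Q
        exp_y = (exp_y + m * inv(r)) % Q
    return prod_r == pow(G, exp_g, P) * pow(Y, exp_y, P) % P % Q

def known_signature(m, k):
    """One honest signature, under the assumed rule s = r*k - m*x mod Q."""
    r = pow(G, k, P) % Q
    s = (r * k - m * X) % Q
    return r, s

def forge(m1, m2, A, B, C):
    """Return a two-message batch passing batch_ok, or None if step 1 fails."""
    for r1 in range(1, Q):                         # step 1 by exhaustive search
        r2 = A * inv(r1) % Q
        if (m1 * inv(r1) + m2 * inv(r2)) % Q == C:
            s1 = secrets.randbelow(Q)              # step 2: s1 is free
            s2 = (B - s1 * inv(r1)) * r2 % Q
            return [(m1, r1, s1), (m2, r2, s2)]
    return None

r0, s0 = known_signature(777, 45)
A, B, C = r0, s0 * inv(r0) % Q, 777 * inv(r0) % Q

forged = None
for m2 in range(1, 200):       # about half of all message pairs are forgeable
    forged = forge(101, m2, A, B, C)
    if forged is not None:
        break
print(forged is not None and batch_ok(forged))
```

The forger never touches the private key: both sides of the batch test are simply steered to the values A, B and C taken from the known signature.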

3 General Attack on the Small Exponents Test 

In this section we show that the small exponents test described in table 1 is
much less useful than it at first appears. We will show that many of the proposed
applications for the test are, in fact, not appropriate at all.

3.1 Attacking Batch Verification of DSA 

In order to explain the weakness we first describe the batch DSA verification 
proposed by Bellare et al. [2]. (Note that this application was not included in 
the shortened version of the paper published at Eurocrypt’98 [1]). As previously 
suggested by Naccache et al. [15] the verification algorithm is applied not to the 
original DSA signature scheme but to a slightly altered version. 

The setting is again in a subgroup G of $Z_p^*$ of prime order q where a user's
private key is $x \in Z_q$ with public key $y = g^x \in G$. The signature of a
(pre-processed) message m is a triple $(\lambda, s, m)$ which satisfies the
following verification equation, where $r = \lambda \bmod q$.

$$\lambda = g^{m s^{-1}} y^{r s^{-1}} \bmod p$$

The difference in original DSA is that $\lambda$ is replaced by r, and the
verification equation is reduced modulo q. This means that the original DSA
signature is only twice the size of q instead of the size of q plus the size of
p in the revised version. Since typical sizes of p and q would be 1024 and 160
bits respectively, this is a significant extra overhead which might be
worthwhile for the computational gains of batch verification. Note that the
modified version can easily be converted into an original DSA version at any
time by replacing $\lambda$ with r. Bellare et al. applied the small exponents
test to a batch of modified DSA signatures as shown in table 2.

Given: Public parameters p, q, g, a public key y and a batch of claimed signatures:
$(\lambda_1, s_1, m_1), \ldots, (\lambda_n, s_n, m_n)$ with $s_i \in Z_q$ and $\lambda_i \in G$. Also a security parameter l.
Check: That $\forall i \in \{1, \ldots, n\} : \lambda_i = g^{m_i s_i^{-1}} y^{\lambda_i s_i^{-1}} \bmod p$.

1. For i = 1, . . . , n set $a_i = s_i^{-1} m_i \bmod q$ and $b_i = s_i^{-1} \lambda_i \bmod q$.

2. Pick $w_1, \ldots, w_n \in \{0, 1\}^l$ at random.

3. Compute $A = \sum_{i=1}^{n} a_i w_i \bmod q$, $B = \sum_{i=1}^{n} b_i w_i \bmod q$, and $R = \prod_{i=1}^{n} \lambda_i^{w_i}$.

4. If $g^A y^B = R$ then accept, else reject.



Table 2. Small exponents test for batch verification of modified DSA [2]



We now apply our main observation to the algorithm: at no time in the
algorithm is it checked that the $\lambda_i$ values are actually within the
group G as they should be. Once this is observed it is straightforward to
develop an attack. (In contrast to the attack in section 2.2 only the true
signer can carry out this attack.) Similar to the attack on Yen-Laih's
algorithm in section 2.1, the idea of the attack is to replace one or more
$\lambda_i$ values by $-\lambda_i$ and the signatures will be accepted with
probability 1/2. Because the $b_i$ values in the test depend on $\lambda_i$ the
attacking signer needs to choose $\lambda_i$ first and then find $s_i$.
Specifically the signer proceeds as follows to run the attack with one or more
of the messages.

1. Choose $k_i$ randomly in $Z_q$ and set $L_i = g^{k_i} \bmod p$.

2. Set $\lambda_i = -L_i \bmod p$, $r_i = \lambda_i \bmod q$ and $s_i = k_i^{-1}(m_i + x r_i) \bmod q$.

3. Present $(\lambda_i, s_i, m_i)$ to the verifier as part of the batch.

It follows that

$$g^{m_i s_i^{-1} \bmod q} \, y^{r_i s_i^{-1} \bmod q} \bmod p = g^{k_i} \bmod p = L_i,$$

and since $L_i^2 = \lambda_i^2 \bmod p$ this will go undetected if the verifier
chooses this $w_i$ to be even which happens with probability 1/2.
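
The attack can be traced on a toy instance. In the Python sketch below, the parameters (p = 2039, q = 1019, g = 4), the private key value and the function names are assumptions made for illustration, and the verifier's exponents $w_i$ are passed in explicitly so that the even and odd cases can be compared.

```python
P, Q, G = 2039, 1019, 4      # toy parameters (assumed): 4 has prime order Q mod P
X = 321                      # signer's private key (hypothetical value)
Y = pow(G, X, P)

def sign(m, k, negate=False):
    """Modified DSA triple (lam, s, m); negate=True applies step 2 of the attack."""
    L = pow(G, k, P)
    lam = (-L) % P if negate else L
    r = lam % Q
    s = pow(k, -1, Q) * (m + X * r) % Q
    if s == 0:
        raise ValueError("s is not invertible, choose another k")
    return lam, s, m

def batch_verify(batch, w):
    """Table 2, with the exponents w_i supplied by the caller."""
    A = B = 0
    R = 1
    for (lam, s, m), w_i in zip(batch, w):
        s_inv = pow(s, -1, Q)
        A = (A + s_inv * m * w_i) % Q
        B = (B + s_inv * lam * w_i) % Q
        R = R * pow(lam, w_i, P) % P
    return pow(G, A, P) * pow(Y, B, P) % P == R

honest = sign(42, 7)
bogus = sign(500, 1, negate=True)     # lam lies outside the subgroup G
batch = [honest, bogus]

print(batch_verify(batch, [3, 2]),    # even exponent on the bogus triple
      batch_verify(batch, [3, 5]))    # odd exponent on the bogus triple
```

The bogus triple is accepted whenever its exponent is even, although its first component is not an element of G at all.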

As with the attack on Yen-Laih, it can be generalised by substituting
$\lambda_i = aL_i \bmod p$ for an element a with any order t where $t \mid p - 1$
and t is smaller than $2^l$. Usually there will be many such t values that can
be chosen. Then the signature will be accepted with probability 1/t.

We would like to emphasise that this does not invalidate the theorem proven
by Bellare et al. regarding the security of their small exponents test since it is
an assumption in table 1 that the $y_i$ values are in the group G. Furthermore,
strictly the application is correct as long as the $\lambda_i$ values are in G,
but this is not a reasonable assumption in practice.

3.2 Other Schemes Susceptible to the Attack 

Several other published schemes make essentially the same unjustified assump- 
tion. An attack on the earlier DSA batch verification scheme of Naccache et al. 
[15] is identical to that proposed above. A similar attack on the batch verifier 
for a Schnorr signature variant proposed by Yen and Laih [22] is possible. Note 
that if all (or most) of the small exponents are chosen to be odd, as is
suggested by Naccache et al., the substitution should be made on an even
number of $\lambda_i$ values for the attack to succeed.

Another application that is vulnerable is a recent proposal for batch ver- 
ification of coins in Brands’ cash scheme [17]. In this proposal the merchant 
essentially uses the small exponents test during the payment protocol to ver- 
ify a batch of coins together; a batch test is also used by the bank at deposit 
time. A possible consequence of the above attack is that a customer can frame 
a merchant since there is a high probability that the customer can have a bad
coin accepted at payment time but that it will be rejected by the bank during 
deposit. 

The alternative bucket test of Bellare et al. [1] is also vulnerable to the same 
attack, since it basically consists of a series of small exponent tests run on random 
partitions of the batch. However, in many instances it will detect the attack with 
much higher probability than the small exponents test. The bucket test uses an
additional parameter $m$, repeats the partitioning $\lceil l/(m - 1) \rceil$ times, and runs
the small exponents test with parameter $m$ in place of $l$. So a value $\lambda_i$ replaced
by $-\lambda_i$ will be detected with probability 1/2 in every repetition, and therefore
escapes detection with probability $2^{-\lceil l/(m-1) \rceil}$ overall. This is still much
worse than the claimed probability of failure of $2^{-l}$.
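To make the gap concrete, the escape probability can be evaluated for sample parameters. The function name and the choice $m = 7$ below are assumptions for illustration; $l = 60$ is the value suggested later in the text.

```python
from fractions import Fraction

def bucket_test_escape_probability(l, m):
    """Repetitions ceil(l/(m-1)) and the chance 2^-repetitions that a sign flip survives."""
    repetitions = -(-l // (m - 1))          # integer ceiling of l / (m - 1)
    return repetitions, Fraction(1, 2 ** repetitions)

reps, prob = bucket_test_escape_probability(60, 7)
claimed = Fraction(1, 2 ** 60)
```

With these values the sign-flipped element survives all 10 repetitions with probability $2^{-10}$, against a claimed failure probability of $2^{-60}$.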

4 Repairing the Small Exponents Test 

An obvious way to prevent the attack is to check that the $\lambda_i$ values in table 2 are
indeed in $G$, as required by the small exponents test. However, there does not
appear to be any way to do this that does not totally negate the computational
savings of the test. For example, to test directly that $\lambda_i^q \bmod p = 1$ would require
$n$ extra exponentiations. Note that it is not sufficient to check, for example, that
the product of the $\lambda_i$ values is in $G$.
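A small numerical example shows why checking only the product is not enough: two sign-flipped values are each outside $G$, yet their product is back inside it. The helper name `in_subgroup` and the toy parameters are assumptions made for this sketch.

```python
def in_subgroup(v, p, q):
    """Direct membership test for the order-q subgroup G: one exponentiation per value."""
    return pow(v, q, p) == 1

# Toy parameters, assumed for illustration: q = 11 divides p - 1 = 418.
p, q = 419, 11
g = pow(2, (p - 1) // q, p)                 # an element of order q
lam1 = (-pow(g, 3, p)) % p                  # -g^3, not in G
lam2 = (-pow(g, 5, p)) % p                  # -g^5, not in G
product = lam1 * lam2 % p                   # equals g^8, which is in G
```

Because $q$ is odd, $(-L)^q = -1$ for any $L \in G$, so each flipped value fails the direct test while the product of an even number of them passes.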

The main problem in ensuring that the proof still holds is to avoid elements 
of low order in the 'large group'. The element of order 2 is always present in $Z_p^*$
so we have to accept that there may be sign changes in a batch that passes the test.
In this section we show that through judicious choice of p it is possible to avoid 
any other problems. 

4.1 Dealing with Prime Order Subgroups 

First of all we assume that $p$ is chosen to be of the form $p - 1 = 2rq$ where $r$ and $q$
are both primes. The modified form of the small exponents test is shown in table
3; the differences from that in table 1 are small but significant. In particular there
is no assumption that the $y_i$ values lie in $G$. A consequence of this difference is
that exponentiations are only known to be correct up to a possible factor of $-1$.
This should be acceptable in most applications since it can always be corrected 
if a particular value is later found to be incorrect. 

The computational cost of the modified test is identical to that of table 1. 
Using an improved algorithm for multiexponentiation, Bellare et al. [1] calculated
the total cost of the test as $l + n(1 + l/2)$ multiplications plus the cost of the
exponentiation. The exact cost will depend on the size of the values of $p$, $q$ and
$l$ (as well as the algorithms used for exponentiation and multi-exponentiation).
Reasonable values today might be $|p| = 1024$, $|q| = 160$ and $l = 60$.
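As a rough worked example of this cost formula, the sketch below evaluates $l + n(1 + l/2)$ for a batch of $n = 100$ and compares it with checking each exponentiation separately. The figure of $1.5 \cdot |q|$ multiplications per exponentiation (plain square-and-multiply) is an assumption introduced here and is not taken from the paper, as are the function names and the batch size.

```python
def batch_multiplications(n, l):
    """l + n(1 + l/2) multiplications; the final exponentiation is not counted."""
    return l + n * (1 + l / 2)

def naive_multiplications(n, q_bits):
    """n separate exponentiations at an ASSUMED 1.5 * q_bits multiplications each."""
    return n * 1.5 * q_bits

batch = batch_multiplications(100, 60)      # l = 60 as suggested in the text
naive = naive_multiplications(100, 160)     # |q| = 160 as suggested in the text
```

Under these assumptions the batch test needs a few thousand multiplications where the naive check needs tens of thousands, which is the saving that any repair of the test should try to preserve.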

Theorem 1. Suppose $p$ is a prime and $G$ a subgroup of $Z_p^*$ of prime order $q$. If
$p - 1 = 2qr$ where $r$ is prime and $\min(q, r) > 2^l$ then the algorithm in table 3 is
a batch verifier which fails with probability at most $2^{-l}$.
Given: $g$ a generator of the subgroup $G$ of $Z_p^*$ and $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with
$x_i \in Z_q$ and $y_i \in Z_p^*$. Also a security parameter $l$.

Check: That $\forall i \in \{1, \ldots, n\}$: $\pm y_i = g^{x_i}$.

1. Pick $s_1, \ldots, s_n \in \{0, 1\}^l$ at random.

2. Compute $x = \sum_{i=1}^{n} x_i s_i \bmod q$ and $y = \prod_{i=1}^{n} y_i^{s_i}$.

3. If $g^x = y$ then accept, else reject.



Table 3. Modified small exponents test for batch verification of exponentiation
in $Z_p^*$
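The test of table 3 can be sketched as follows. The function name, the optional argument `s` (used to fix the exponents for experiments) and the toy parameters mentioned afterwards are assumptions for illustration; real parameters would be of the sizes discussed above.

```python
import secrets

def modified_small_exponents_test(p, q, g, pairs, l, s=None):
    """Table 3: pairs is a list of (x_i, y_i) with y_i anywhere in Z_p^*."""
    if s is None:
        # step 1: one random l-bit exponent per pair
        s = [secrets.randbits(l) for _ in pairs]
    x = 0
    y = 1
    for (xi, yi), si in zip(pairs, s):
        x = (x + xi * si) % q               # step 2: x = sum of x_i s_i mod q
        y = y * pow(yi, si, p) % p          # step 2: y = product of y_i^{s_i}
    return pow(g, x, p) == y                # step 3: accept iff g^x = y
```

With `p, q, l = 419, 11, 3` (so that $p - 1 = 2 \cdot 11 \cdot 19$ and $\min(q, r) > 2^l$), a single pair with $y_1 = g^{x_1 + 1}$ is accepted only for the exponent $s_1 = 0$, that is, for one of the $2^l$ possible choices. A pair with $y_1 = -g^{x_1}$ is accepted whenever $s_1$ is even, which is the sign ambiguity that the check $\pm y_i = g^{x_i}$ tolerates.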



Proof. The proof is basically similar to that of Bellare et al. for their small
exponents test but there are a few extra problems to consider. Suppose that $g_0$
is a generator of $Z_p^*$ and suppose, without loss of generality, that $g = g_0^{2r}$. We
can then write $y_i = g_0^{x'_i}$ for some $x'_i$ with $1 \le x'_i \le p - 1$. Suppose that the test
passes; then the following equation holds.

$$g_0^{\,2rx \bmod (p-1)} = g_0^{\,\sum_{i=1}^{n} x'_i s_i \bmod (p-1)}$$

Because $g_0$ is a generator of $Z_p^*$ we have

$$2r(s_1 x_1 + \ldots + s_n x_n) = x'_1 s_1 + \ldots + x'_n s_n \bmod (p - 1)$$

which we may re-write as the following.

$$s_1(x'_1 - 2rx_1) + \ldots + s_n(x'_n - 2rx_n) \bmod (p - 1) = 0 \qquad (4)$$

Suppose that for at least one value of $i$ we have $\pm y_i \ne g^{x_i}$. Without loss of
generality let us assume that $i = 1$. If we suppose that the values of $s_2, \ldots, s_n$
have been chosen, then equation 4 is a linear equation in $s_1$ and the number of
solutions for $s_1$ is either 0 or $\nu = \gcd(p - 1, 2rx_1 - x'_1)$. Because $p - 1 = 2qr$, $\nu$
can take any of the eight values $\{1, 2, q, r, 2r, 2q, qr, 2qr\}$. But the case $\nu = 2qr$
means that $2rx_1 = x'_1 \bmod (p - 1)$ so $y_1 = g^{x_1}$ which we have assumed is not true.

The next largest case is $\nu = qr$, so that we have either $2rx_1 = x'_1 \bmod (p - 1)$
or $2rx_1 + qr = x'_1 \bmod (p - 1)$. The former possibility is ruled out and the latter
possibility means that $y_1 = g_0^{x'_1} = g_0^{2rx_1 + qr} = -g^{x_1}$ which is also assumed not to
hold.

The remaining cases do not satisfy the check so we need to show that they
occur with small probability. The next largest case is $\nu = 2r$. Although in this
case there are many solutions to equation 4, these solutions are evenly distributed
in the sense that if $\lambda$ is any solution for $s_1$ then $\lambda + q$ is also a solution. This
means that there is at most one solution for $s_1$ in the range $0 \le s_1 < 2^l$ since
$q > 2^l$. A similar argument holds for all other possible values of $\nu$. Since $s_1$ is
chosen randomly the probability that equation 4 holds when $\pm y_1 \ne g^{x_1}$ is thus
at most $2^{-l}$. The same is then true if all $s_1, \ldots, s_n$ are drawn independently and
randomly. □
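The counting argument in the proof can be checked by brute force on toy parameters. The sketch below is an illustration only and not part of the proof: it restricts attention to a batch of one, fixes $x_1 = 1$, and uses a hypothetical helper `worst_case_passes` together with small primes chosen here. For every $y_1$ with $\pm y_1 \ne g^{x_1}$ it counts the exponents $s_1 \in [0, 2^l)$ for which the test of table 3 accepts.

```python
def worst_case_passes(p, q, l):
    """Max, over all wrong y1, of the number of s1 in [0, 2^l) accepted (batch of one)."""
    # any non-trivial (p-1)/q-th power generates the subgroup of prime order q
    g = next(c for c in (pow(h, (p - 1) // q, p) for h in range(2, p)) if c != 1)
    x1 = 1
    target = pow(g, x1, p)
    worst = 0
    for y1 in range(1, p):
        if y1 == target or y1 == p - target:
            continue                        # skip the two values with +-y1 = g^x1
        passes = sum(1 for s1 in range(2 ** l)
                     if pow(g, x1 * s1 % q, p) == pow(y1, s1, p))
        worst = max(worst, passes)
    return worst

safe = worst_case_passes(419, 11, 3)        # p - 1 = 2 * 11 * 19, min(q, r) > 2^3
unsafe = worst_case_passes(67, 11, 3)       # p - 1 = 2 * 3 * 11, cofactor 3 < 2^3
```

For $p = 419$ the only accepted exponent is the trivial $s_1 = 0$, matching the bound of one solution in $2^l$. For $p = 67$ an element of order 3 is available, and a wrong $y_1$ is accepted for $s_1 \in \{0, 3, 6\}$, which shows why the theorem requires $\min(q, r) > 2^l$.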






